Description

The present invention relates to the field of molecular biology. In particular, it relates to, among other things,
nucleotide sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof,
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990)). Species differ from one another by 80% or more, by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure.

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below.

Human Health and S. Aureus 

Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993)). It is an etiological agent of a variety of conditions, ranging in
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis,
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below.

Burns

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling,
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable
debris on the burn surface ("eschar"), it may progress into full skin infection and invade viable tissue below the eschar,
or it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of
S. aureus and microaerophilic streptococci. It causes necrosis, and treatment is limited to excision of the necrotic tissue.
The condition often is fatal.

Eyelid infections

S. aureus is the cause of styes and of "sticky eye" in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, but they may occasionally penetrate the surface with more severe consequences.

Food poisoning

Some strains of S. aureus produce one or more of five serologically distinct, heat and acid stable enterotoxins that
are not destroyed by the digestive processes of the stomach and small intestine (enterotoxins A-E). Ingestion of the toxin,
in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require viable bacteria.
Although the toxins are known, their mechanism of action is not understood.

Joint infections 

S. aureus infects bone joints, causing diseases such as osteomyelitis.

Osteomyelitis 

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the ends of long, growing bones, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing
bones.

Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections 
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections. 
Recurrent infections of the nasal passages plague nasal carriers of S. aureus. 

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases, such as endocarditis and osteomyelitis.

Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.

Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin.
The disease can be caused by S. aureus infection at any site, although it is too often erroneously viewed as a
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin, although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated
episomally, typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These 
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that 
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus, 
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline, 
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply 
drug resistant. 
The molecular genetics of most types of drug resistance in S. aureus has been elucidated (See Lyon et al.,
Microbiology Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration
of a resistance element into the S. aureus genome itself.

Thus far each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus Aureus 

Despite its importance in, among other things, human disease, relatively little is known about the genome of this 
organism. 

Most genetic studies of S. aureus have been carried out using the strain NCTC8325, which contains prophages
φ11, φ12 and φ13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular,
covalently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low
resolution and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325. (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R.P. Novick, Ed., VCH Publishers, New York (1990).) The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.

There is a need therefore to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one
embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/
optical storage media.

The present invention further provides systems, particularly computer-based systems which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and
proteins, encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an 
immortalized cell line which is capable of secreting a specific monoclonal antibody. 

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
have enhanced and will greatly enhance the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of
regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative
genomic and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer
Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix-based Staphylococcus aureus relational database.
Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from Genbank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the databases described below; some results of the determination and the searches are set out in Tables 1-3.
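The ambiguity-trimming and ORF-identification steps described above can be illustrated with a minimal Python sketch. It is not the seq_filter or zorf program, whose internals are not described here. The function names are hypothetical, the filter rejects a whole read rather than trimming portions of it, and the ORF scan only considers ATG-to-stop frames on the forward strand.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not the actual seq_filter or zorf implementations.

UNAMBIGUOUS = set("ACGT")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of bases that are not A, C, G or T."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base not in UNAMBIGUOUS for base in seq) / len(seq)


def passes_ambiguity_filter(seq: str, max_fraction: float = 0.02) -> bool:
    """Keep a read only if at most max_fraction of its bases are ambiguous."""
    return ambiguous_fraction(seq) <= max_fraction


def find_orfs(seq: str, min_codons: int = 3):
    """Yield (start, end, frame) for forward-strand ATG...stop ORFs.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield (start, i + 3, frame)
                start = None  # resume scanning after the stop codon
```

A production ORF finder would also scan the reverse complement and would likely allow alternative start codons; those choices are left out to keep the sketch short.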

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC
nomenclature system.)

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers
to any portion of the SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of Staphylococcus
aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided in Tables
1-3.

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins.

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques, potential
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing
effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine
application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection (ATCC).

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
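As a rough illustration of what a percent-identity figure measures, the following Python sketch counts matching columns in two sequences that have already been aligned. It is an assumption-laden simplification: FASTA and BLASTN compute their own local alignments and report identity over the aligned region, which this sketch does not attempt to reproduce. The function names are hypothetical.

```python
# Illustrative sketch only; not the FASTA or BLASTN scoring procedure.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Inputs must be pre-aligned to equal length, with '-' marking gaps.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 95 / 99 / 99.9 thresholds used in the text."""
    for threshold in (99.9, 99.0, 95.0):
        if pct >= threshold:
            return f"at least {threshold:g}%"
    return "below 95%"
```

Counting a gap opposite a base as a mismatch is one convention among several; a different convention would give slightly different percentages for gapped alignments.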

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide
sequence of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus
genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
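The two storage options named above, an ASCII text file and a database application, can be sketched in Python as follows. This is an illustration under stated assumptions: the FASTA text layout is one common ASCII convention and is not prescribed by the text, SQLite from the Python standard library stands in for database applications such as DB2, Sybase or Oracle, and the table name, column names and identifiers are hypothetical.

```python
# Illustrative sketch only; SQLite stands in for the database applications
# named in the text, and all table, column and record names are hypothetical.
import sqlite3
from typing import Dict, Optional


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Render {identifier: sequence} as plain ASCII text in FASTA layout."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        # Wrap each sequence at a fixed line width for readability.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def store_in_database(conn: sqlite3.Connection, records: Dict[str, str]) -> None:
    """Write the same records into a single relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contig "
        "(seq_id TEXT PRIMARY KEY, bases TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO contig (seq_id, bases) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


def fetch_sequence(conn: sqlite3.Connection, seq_id: str) -> Optional[str]:
    """Return the stored sequence for seq_id, or None if it is absent."""
    row = conn.execute(
        "SELECT bases FROM contig WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

As the text notes, the choice between the two depends on how the stored information will be accessed: a flat text file suits sequential reading by search programs, while a keyed table suits retrieval of individual sequences by identifier.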

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially important
proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence 
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage 
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means 
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, 
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having 
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means 
for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software packages for conducting search means are available and can be used in the
computer-based systems of the present invention. Examples of such software include, but are not limited to,
MacPattern (EMBL), BLASTN and BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches can be adapted for use in the present
computer-based systems.
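
For illustration only, a search means in its simplest form might be sketched in Python as an exact-match scan of stored contigs on both strands. This is a deliberately minimal stand-in and is not the BLASTN, BLASTX or MacPattern software named above; the function names and the example contig names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target(contigs, target):
    """Return (contig name, strand, 1-based start) for every exact occurrence of target.

    A '-' hit means the reverse complement of the target was found at the
    reported position on the stored (forward) strand. A palindromic target
    is therefore reported once for each strand.
    """
    hits = []
    target = target.upper()
    queries = (("+", target), ("-", reverse_complement(target)))
    for name, seq in contigs.items():
        seq = seq.upper()
        for strand, query in queries:
            start = seq.find(query)
            while start != -1:
                hits.append((name, strand, start + 1))
                start = seq.find(query, start + 1)  # step by one to allow overlaps
    return hits
```

A practical search means would additionally tolerate mismatches and gaps and would score the statistical significance of each match, as the homology search programs referred to herein do.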

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that target sequences used in searches for commercially important fragments, such as sequence fragments involved in
gene expression and protein processing, may be of shorter length.
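
The relationship between target length and random occurrence can be made concrete with a short calculation. The Python sketch below assumes, purely for illustration, a database of independent, equally frequent residues, so that a specific target of length n is expected at any given position with probability 1/4^n for nucleotides (or 1/20^n for amino acids). The function name and the database size used in the usage note are assumptions and are not taken from the present disclosure.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of chance exact matches to one specific target.

    Assumes a random database in which every residue is independent and
    equally likely: (number of start positions) / alphabet_size ** target_len.
    """
    positions = max(db_len - target_len + 1, 0)
    return positions / alphabet_size ** target_len
```

Under these assumptions, a six-nucleotide target is expected to occur by chance hundreds of times in a database of a few million bases, whereas `expected_random_hits(30, 2_800_000)` is far below one, consistent with the preference stated above for target sequences of about 30 nucleotides or more.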

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combi- 
nation of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed 
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, 
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in 
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the 
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target 
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in the identified fragment.
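
A ranked output of the kind just described might be sketched as follows. The sketch uses an ungapped sliding-window comparison and reports percent identity for each window; it is a simplification offered for illustration and is not the BLAST or BLAZE algorithm. The function names and the default threshold of 70% are hypothetical.

```python
def window_identity(target, window):
    """Percent of aligned positions at which target and window agree (equal lengths)."""
    matches = sum(a == b for a, b in zip(target, window))
    return 100.0 * matches / len(target)


def rank_fragments(contigs, target, min_identity=70.0):
    """Return (percent identity, contig name, 1-based start), best matches first."""
    n = len(target)
    results = []
    for name, seq in contigs.items():
        for i in range(len(seq) - n + 1):
            identity = window_identity(target, seq[i:i + n])
            if identity >= min_identity:
                results.append((identity, name, i + 1))
    # Highest identity first; ties broken by contig name, then position.
    results.sort(key=lambda r: (-r[0], r[1], r[2]))
    return results
```

Each entry of the returned list identifies a fragment and its degree of identity to the target, so that a skilled artisan reading the list from the top sees the closest matches first.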

A variety of comparing means can be used to compare a target sequence or target motif with the data storage 
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, software
which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can
readily recognize that any one of the publicly available homology search programs can be used as the search means
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known 
to those of skill also may be employed in this regard. 

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
the removable storage medium 116 is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, 
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for 
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system 
and the software program or programs. 

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome, 
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include, 
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs) and fragments
which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome"
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to 
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and 
especially preferably at least 99.9% identical in sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables 
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191 
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or 
other nucleic acid fragment of the present invention. 

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any 
termination codons and is a sequence translatable into protein. 
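
The definition above can be illustrated with a short Python sketch that scans the three forward reading frames of a sequence and reports each run of codons that contains no termination codon. This is an illustrative reading of the definition only; it is not the GeneMark software referred to herein, and the function name, the default minimum length and the output layout are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq, min_codons=80):
    """Return (frame, 1-based start, length in nt) for each stop-free codon run.

    Frames are numbered +1, +2, +3, taking the first 5' nucleotide of the
    sequence as the start of the +1 frame. Only the given strand is scanned.
    """
    seq = seq.upper()
    runs = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                # A termination codon closes the current run.
                if (pos - run_start) // 3 >= min_codons:
                    runs.append((frame + 1, run_start + 1, pos - run_start))
                run_start = pos + 3
            pos += 3
        # Report a run that extends to the end of the sequence.
        if (pos - run_start) // 3 >= min_codons:
            runs.append((frame + 1, run_start + 1, pos - run_start))
    return runs
```

To examine the opposite strand, the same function may be applied to the reverse complement of the sequence.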

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists. 

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino
acids long and that, over a continuous region of at least 50 bases, are 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and 
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by Sep- 
tember 1996. 

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly, 
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number 
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of 
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand; 
and the fifth column indicates the length of each ORF in nucleotides. 
In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their designations. Descriptions of the nomenclature are available from the National Center for Biotechnology
Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues. 

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided 
below and are described in the pertinent literature highlighted by the citations provided below. 
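
The worked example above (70% identity, 80% similarity) can be reproduced with the following Python sketch. The grouping of "similar" amino acids used here is a simple illustrative assumption; programs such as FASTA and BLAST rely on scoring matrices rather than on fixed groups, and the function names and example peptides are hypothetical.

```python
# Illustrative groups of biochemically similar residues (an assumption for this sketch).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def is_similar(a, b):
    """True if two residues are identical or fall in the same illustrative group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity(p, q):
    """Percent of aligned positions carrying identical residues (equal-length inputs)."""
    if len(p) != len(q) or not p:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(a == b for a, b in zip(p, q)) / len(p)


def percent_similarity(p, q):
    """Percent of aligned positions carrying identical or similar residues."""
    if len(p) != len(q) or not p:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(is_similar(a, b) for a, b in zip(p, q)) / len(p)


# Two 10-residue peptides differing at positions 1, 3 and 5; only the
# position-5 pair (Y versus F) is "similar" under the groups above.
first = "MKTAYIAKQR"
second = "AKGAFIAKQR"
```

With these inputs, `percent_identity(first, second)` gives 70.0 and `percent_similarity(first, second)` gives 80.0, matching the example in the text.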

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the 
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled 
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those 
ascertainable using the computer-based systems of the present invention. 

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates
the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression
of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to 
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.
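
As a hedged illustration of the proximity-based approach, the Python sketch below extracts candidate intergenic segments from a contig given ORF coordinates expressed as in the tables (first nucleotide counted from the 5' end of the contig, and length in nucleotides). It assumes, for simplicity, that all ORFs lie on the given strand, and where a gap is longer than the upper limit it keeps the portion immediately 5' to the downstream ORF; both choices, and the function name, are assumptions made for this sketch.

```python
def intergenic_segments(contig, orfs, min_len=10, max_len=200):
    """Return (1-based start, sequence) for gaps between consecutive ORFs.

    orfs is a list of (1-based start, length in nt) pairs on the given strand.
    Gaps shorter than min_len, and overlapping ORFs, yield no segment.
    """
    segments = []
    ordered = sorted(orfs)
    for (start1, length1), (start2, _length2) in zip(ordered, ordered[1:]):
        gap_start = start1 + length1  # first base after the upstream ORF
        gap_end = start2 - 1          # last base before the downstream ORF
        if gap_end - gap_start + 1 < min_len:
            continue
        # Keep at most max_len bases, nearest the downstream ORF.
        seg_start = max(gap_start, gap_end - max_len + 1)
        segments.append((seg_start, contig[seg_start - 1:gap_end]))
    return segments
```

Segments obtained in this way are only candidates; as described below, the presence and activity of an EMF would still be confirmed experimentally.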

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a 
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic 
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap 
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below. 

A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction 
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate 
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize 
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of 
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and 
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not limited to the specific sequences herein 
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined
by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an 
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. 
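
Whether two coding sequences differ only by such synonymous codon substitutions can be checked by translating both and comparing the results, as in the following Python sketch. The sketch assumes the standard genetic code, written in the conventional T, C, A, G codon order; the function names and the example sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate a DNA coding sequence codon by codon, ignoring a trailing partial codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def same_polypeptide_different_codons(a, b):
    """True if a and b differ in nucleotide sequence but encode the same polypeptide."""
    return a.upper() != b.upper() and translate(a) == translate(b)
```

For example, `ATGGCTAAA` and `ATGGCCAAG` both translate to the tripeptide Met-Ala-Lys and so are treated as equivalent coding sequences in the sense described above, whereas `ATGGATAAA` encodes a different polypeptide.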

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, 
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments
in question as a probe or primer. 

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of Staphylococcus
aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression 
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides. Polynucleotides
suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques
in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and OLIGODEOXYNUCLEOTIDES
AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
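
As a minimal sketch of the sequence-design step for antisense inhibition, the following Python function returns a DNA oligonucleotide complementary to a chosen region of the sense (mRNA-like) strand, restricted to the 20 to 40 base range mentioned above. It addresses complementarity only and does not model hybridization strength, secondary structure or specificity; the function name, its parameters and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_seq, start, length=20):
    """Return the antisense DNA oligo (written 5' to 3') for a region of the sense strand.

    start is the 1-based position of the first targeted base; length must lie
    in the 20 to 40 base range and the region must fit within the sequence.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    region = sense_seq[start - 1:start - 1 + length].upper()
    if start < 1 or len(region) < length:
        raise ValueError("target region extends beyond the sequence")
    # The antisense strand is the reverse complement of the targeted region.
    return region.translate(COMPLEMENT)[::-1]


# Hypothetical 30-nucleotide coding sequence used only to exercise the function.
example_sense = "ATGAAAGCTCTGATCGGTACCGTTAACTGA"
example_oligo = antisense_oligo(example_sense, 1, 20)
```

Taking the reverse complement of `example_oligo` recovers the first 20 bases of `example_sense`, which confirms that the oligonucleotide is complementary to the targeted region.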

The present invention further provides recombinant constructs comprising one or more of the Staphylococcus
aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter,
operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following vectors are provided by 
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art. 

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus 
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the 
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a prokaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present 
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such 
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present 
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence. 

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode 
proteins. 

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available
peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies 
against the native polypeptide, as discussed further below. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity
chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to 
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide 
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures
for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention. 

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, 
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial 
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or 
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially
free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells. 

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the 
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial
or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypep- 
tide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly 

of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, enhancers and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, 
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the 
expressed recombinant protein to provide a final product. 

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional 

unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins
upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Edition,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety. 

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor,
acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous
sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a 
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host. 

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable 
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements 
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and pGEM1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the 
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or 
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product. 
Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally 
by physical or chemical means, and the resulting crude extract is retained for further purification. 

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian
expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences have the following physical attributes: the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues long, with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
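
The type I criteria above can be sketched as a simple sequence filter. This is only an illustration, not the inventors' algorithm: the residue classes, the five-residue window used for the "extreme amino terminus", the 50% hydrophobicity cut-off and the function names are all assumptions, and the alpha-helical requirement is not modelled.

```python
# Assumed residue classes and cut-offs; the text gives only qualitative criteria.
HYDROPHOBIC = set("AVLIMFWC")   # assumption: residues treated as hydrophobic
SMALL = set("AGS")              # assumption: "small side-chain" residues
N_REGION = 5                    # assumption: size of the extreme amino terminus


def net_charge(segment):
    """Count basic residues (K, R) minus acidic residues (D, E)."""
    basic = sum(segment.count(c) for c in "KR")
    acidic = sum(segment.count(c) for c in "DE")
    return basic - acidic


def type1_signal_length(protein, min_len=15, max_len=25):
    """Return the shortest candidate signal length meeting all checks, else None."""
    for n in range(min_len, min(max_len, len(protein)) + 1):
        signal = protein[:n]
        hydrophobic_fraction = sum(r in HYDROPHOBIC for r in signal) / n
        if (net_charge(signal[:N_REGION]) > 0      # net positive amino terminus
                and hydrophobic_fraction > 0.5     # primarily hydrophobic
                and signal[-1] in SMALL            # position -1
                and signal[-3] in SMALL):          # position -3
            return n
    return None
```

Called on a hypothetical sequence such as `type1_signal_length("MKKLLLAAVLLLAVAASADTPEK")`, the function reports the length of the first prefix that passes every check; a real screen would use a trained predictor rather than fixed cut-offs.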

Also known in the art is the type IV signal sequence, which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S., J.
Bacteriol. 174, 7345-7351; 1992). Type IV signal sequences typically consist of six to eight amino acids with a net
basic charge followed by an additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV
signal sequence is typically after the initial six to eight amino acids at the extreme amino terminus. In addition, all
type IV signal sequences contain a phenylalanine residue at the +1 site relative to the cleavage site.
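
A minimal sketch of the type IV criteria follows, under stated assumptions: the hydrophobic residue set, the 50% cut-off, the use of only the minimum sixteen-residue stretch, and the function names are illustrative choices and are not taken from the specification.

```python
HYDROPHOBIC = set("AVLIMFWC")   # assumption: residues treated as hydrophobic
MIN_HYDROPHOBIC_STRETCH = 16    # lower bound of the 16-30 residue region


def net_charge(segment):
    """Count basic residues (K, R) minus acidic residues (D, E)."""
    basic = sum(segment.count(c) for c in "KR")
    acidic = sum(segment.count(c) for c in "DE")
    return basic - acidic


def type4_leader_length(protein):
    """Return the leader length (6-8) if the type IV checks pass, else None."""
    for n in range(6, 9):
        leader = protein[:n]
        following = protein[n:n + MIN_HYDROPHOBIC_STRETCH]
        if len(following) < MIN_HYDROPHOBIC_STRETCH:
            continue
        hydrophobic_fraction = sum(r in HYDROPHOBIC for r in following) / len(following)
        if (net_charge(leader) > 0            # net basic leader
                and following[0] == "F"       # phenylalanine at +1
                and hydrophobic_fraction > 0.5):
            return n
    return None
```

The returned value is the position after which cleavage would be expected, so `protein[n]` is the +1 phenylalanine.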

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22, 451-471; 1990).
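
The lipoprotein consensus lends itself to a regular expression. The sketch below assumes the motif is sought only near the amino terminus; the 40-residue search window and the function name are hypothetical and are not specified in the text.

```python
import re

# L-(A,S)-(G,A)-C occupying positions -3 to +1 relative to the cleavage point.
LIPOPROTEIN_CONSENSUS = re.compile(r"L[AS][GA]C")


def find_lipoprotein_cysteine(protein, search_window=40):
    """Return the 0-based index of the +1 cysteine, or None if no motif is found.

    search_window is an assumed limit on how far from the amino terminus
    the motif is looked for.
    """
    match = LIPOPROTEIN_CONSENSUS.search(protein[:search_window])
    if match is None:
        return None
    # Cleavage would occur immediately before the cysteine that ends the motif.
    return match.end() - 1
```

Because roughly a quarter of the precursors examined did not carry the exact consensus, a miss from this pattern does not rule a protein out.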

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans, E.
faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A. Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity. ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
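
The carboxy terminal anchor described above can be sketched as a motif search followed by two composition checks. The sketch applies the text's description literally (a six-residue charged tail and a 15-20 residue hydrophobic stretch directly after the motif); the residue sets, the 50% hydrophobicity cut-off and the function name are assumptions, and real anchors may deviate from this idealized layout.

```python
import re

ANCHOR_MOTIF = re.compile(r"LP.TG.")   # L-P-X-T-G-X, X being any amino acid
CHARGED = set("KRDE")                  # assumption: residues counted as charged
HYDROPHOBIC = set("AVLIMFWC")          # assumption: residues treated as hydrophobic


def find_cell_wall_anchor(protein, tail_len=6, min_tm=15, max_tm=20):
    """Return the 0-based start of an L-P-X-T-G-X motif that is followed by a
    hydrophobic stretch and a charged carboxy terminal tail, or None."""
    for match in ANCHOR_MOTIF.finditer(protein):
        rest = protein[match.end():]
        tm_len = len(rest) - tail_len
        if not (min_tm <= tm_len <= max_tm):
            continue
        transmembrane, tail = rest[:tm_len], rest[tm_len:]
        hydrophobic_fraction = sum(r in HYDROPHOBIC for r in transmembrane) / tm_len
        if all(r in CHARGED for r in tail) and hydrophobic_fraction > 0.5:
            return match.start()
    return None
```

Relaxing `tail_len` or the all-charged requirement would be a reasonable adjustment when screening natural sequences.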

Amino acid sequence similarity to proteins of known function, as identified by BLAST, enables the assignment of putative
functions to novel amino acid sequences and allows for the selection of proteins thought to function outside the cell
wall. Such proteins are well known in the art and include those described as "lipoprotein", "periplasmic", or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may 
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF 
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods. 
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general, 
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal 
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic 
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include for 
example the mature forms of the polypeptides expected to exist in nature. 

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the case
of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.
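
Translating an ORF's polynucleotide sequence into its predicted amino acid sequence can be sketched as follows. The codon assignments are those of the standard genetic code; the function name is hypothetical, and the sketch assumes the input is already in frame and does not treat alternative bacterial start codons (such as GTG or TTG) specially.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(dna):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":   # stop codon ends the predicted polypeptide
            break
        residues.append(amino_acid)
    return "".join(residues)
```

Any trailing one or two bases that do not form a full codon are ignored, and a codon containing an ambiguous base would raise a `KeyError`.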

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide selected
from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto form an
embodiment of the invention; in addition polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical thereto, form 
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present 
invention. 

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a 
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The 
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic 
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen. 
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope." 
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for 
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein 
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science, 219:660-666. Peptides capable of eliciting protein-reactive sera are frequently
represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or 
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise 
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance, 
Wilson et al., Cell 37:767-778 (1984) at 777. 

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least 
seven, more preferably at least nine and most preferably between about 15 and about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides 
that can be used to generate S. aureus specific antibodies include: a polypeptide comprising peptides shown in Table 
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins 
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3. 

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response 
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance, 
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the 
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of 
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining 
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding 
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods. 

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus 
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the 
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number 
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid 
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of 
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are 
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention 
further provides polynucleotides encoding such polypeptides. 
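
The x-y notation used for the antigenic regions maps onto a sequence slice as sketched below. The function name and the string form of the region are illustrative assumptions; positions are treated as 1-based and inclusive, as the description of x and y implies.

```python
def epitope_region(orf_sequence, region):
    """Return the residues of an antigenic region written as "x-y".

    x and y are the 1-based positions of the first and last amino acids
    of the epitope within the open reading frame, both inclusive.
    """
    first, last = (int(part) for part in region.split("-"))
    if not (1 <= first <= last <= len(orf_sequence)):
        raise ValueError(f"region {region} lies outside the sequence")
    return orf_sequence[first - 1:last]
```

For a region given as "36-45", the slice contains ten residues, matching the example of amino acids 36 to 45.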

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are sub- 
stantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid 
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity be- 
tween reference and subject sequences. For purposes of the present invention, sequences having equivalent biological 
activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining 
equivalence, truncation of the mature sequence should be disregarded. 

The invention further provides methods of obtaining, from other strains of Staphylococcus aureus, homologs of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of one of the Staphylococcus aureus fragments or contigs, or of a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or to a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain 
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among 
especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred 
among these are those with 97% and even more particularly preferred among those are homologs with 99% or more 
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood 
that, among measures of homology, identity is particularly preferred in this regard. 
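
The graded thresholds above can be illustrated with a small identity calculation. This is a sketch under assumptions: the text does not define how identity is computed, so the code takes two pre-aligned, equal-length sequences (gaps written as "-") and divides matching columns by alignment length; the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(pct):
    """Return the highest homology threshold from the text that pct satisfies."""
    for threshold in (99.9, 99.0, 97.0, 95.0, 93.0):   # "or more" thresholds
        if pct >= threshold:
            return threshold
    for threshold in (90.0, 85.0):                     # "greater than" thresholds
        if pct > threshold:
            return threshold
    return None   # does not share significant homology as defined above
```

A pair that is exactly 90% identical therefore falls in the "greater than 85%" tier rather than the "more than 90%" tier.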

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned 
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency 
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will recognize
that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 


Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed., Macmillan
Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used
for industrial biosynthesis. Such proteins include enzymes involved in the degradation of the intermediary products of
metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and
anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved
in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and
enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional 
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non- 
macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase. 

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY, Whitaker
et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using the Reichstein procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids,
which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as
described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Bennett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochim. et Biophys. Acta 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those
scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. An advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from strains of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of
procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and 
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a 
specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of producing
the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers, Amsterdam,
The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), and include the trioma technique and the human B-cell hybridoma technique (Kozbor et al.,
Immunology Today 4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R.
Liss, Inc. (1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene 
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the 
peptide and the site of injection. 

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase 
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but 
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the
inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such 
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175: 109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass is determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U. S. Patent 4,946,778) can be adapted to
produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody-containing antiserum is isolated from the immunized animal and is screened for
the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (for example, see
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells
or tissues in which a fragment of the Staphylococcus aureus genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose,
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34 Academic Press, N. Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies
to components within the test sample.

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids 
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the assays of the present invention. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such 
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are 
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test sample, a container which 
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, 
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF. 

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
Staphylococcus aureus fragments and contigs herein described.

In general, such methods comprise steps of: 

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated 
fragment of the Staphylococcus aureus genome; and 

(b) determining whether the agent binds to said protein or said fragment. 

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, 
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to 
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, 
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled 
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. 

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
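
By way of illustration only, the design of an antisense oligonucleotide from sequence information can be sketched as follows. The function name, the example sequence and the chosen window are hypothetical and are not sequences of the present invention; only the 20 to 40 base length range is taken from the description above. The sketch simply returns the reverse complement of a selected window of a sense-strand DNA sequence.

```python
# Illustrative sketch only: the function name and the example sequence are
# hypothetical and do not come from the disclosed sequences.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_dna: str, start: int, length: int) -> str:
    """Return the reverse complement (written 5'->3') of a window of sense DNA."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    target = sense_dna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("window runs past the end of the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


example = "ATGAAAGCTTTAGGCCGTACCGATTGACCA"  # made-up 30-base sense sequence
oligo = antisense_oligo(example, 0, 20)
print(oligo)
```

Applying the function twice returns the original window, which is a convenient check that the complementarity has been computed correctly.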

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity 
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will 
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As 
such, related organisms do not need to be bacterial but may be fungal or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative. 
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical 
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological 
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable 
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, 
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein. 

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient 
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood 
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the 
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single 
or multiple injections. 

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will 
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical 
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The therapeu- 
tically effective dose can be lowered by using combinations of the agents of the present invention or another agent. 

As used herein, two or more compounds or agents are said to be administered "in combination" with each other 
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a pharma- 
ceutically acceptable form and in a therapeutically effective concentration. A composition is said to be "pharmacolog- 
ically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered 
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is physiolog- 
ically significant if its presence results in a detectable change in the physiology of a recipient patient. 

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine
microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or 
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals 
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human adminis- 
tration. 

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun 
approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating 
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols. 

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples 
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the 
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES
LIBRARIES AND SEQUENCING 

1. Shotgun Sequencing Probability Analysis 

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2: 231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m=1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage
of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with
an average sequence read length of 410 bp.

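Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the
equation $g = L/n$, where n in this equation is the number of random sequence reads (about 34,000 when 17,000 clones are
sequenced from both insert ends) rather than the number of nucleotides sequenced. Thus, 5X coverage leaves about 240
gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.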
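
The arithmetic of this probability analysis can be reproduced with a short calculation. The following Python fragment is a minimal sketch, not part of the original procedure; the function names are illustrative, and it assumes that fold coverage is nucleotides sequenced divided by genome size and that the divisor in the average gap size formula is the number of sequence reads (17,000 clones read from both ends).

```python
import math

# Minimal sketch of the Poisson (Lander and Waterman) estimates.
# Function names are illustrative assumptions, not from the source.


def fold_coverage(genome_bp: float, sequenced_bp: float) -> float:
    """Fold coverage m: nucleotides sequenced divided by genome size."""
    return sequenced_bp / genome_bp


def unsequenced_fraction(genome_bp: float, sequenced_bp: float) -> float:
    """P0 = exp(-m), the chance that a given base has not been sequenced."""
    return math.exp(-fold_coverage(genome_bp, sequenced_bp))


def total_gap_length(genome_bp: float, sequenced_bp: float) -> float:
    """G = L * exp(-m), in nucleotides."""
    return genome_bp * unsequenced_fraction(genome_bp, sequenced_bp)


def average_gap_size(genome_bp: float, n_reads: int) -> float:
    """g = L / (number of reads); an assumption about the divisor, see lead-in."""
    return genome_bp / n_reads


L = 2_800_000           # genome size in nucleotides (2.8 Mb)
reads = 17_000 * 2      # 17,000 clones sequenced from both insert ends
sequenced = reads * 410  # average read length of 410 bp

print(f"1X unsequenced fraction: {unsequenced_fraction(L, L):.2f}")
print(f"5X unsequenced fraction: {unsequenced_fraction(L, 14_000_000):.4f}")
gap_total = total_gap_length(L, sequenced)
gap_mean = average_gap_size(L, reads)
print(f"average gap: {gap_mean:.0f} bp, gaps: {gap_total / gap_mean:.0f}")
```

Under these assumptions the estimated number of gaps comes out in the low-to-mid 200s, in reasonable agreement with the figure of about 240 gaps stated above.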

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988). 

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a nearly ideal library of 
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end. 

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of 
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated 
and redissolved in 500 ul TE buffer. 

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, the
LGT agarose was melted, and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single 
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), 
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were 
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i 
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture was 
stored at -20°C. 

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
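
The stated reagent concentrations can be checked with a simple dilution calculation. The sketch below is illustrative only: the helper name is hypothetical, and it assumes that volumes are additive and that the cell aliquot and the agar initially contain none of the added reagent.

```python
# Illustrative dilution check; the helper name is hypothetical and volumes
# are assumed to be additive.


def diluted_concentration(stock_conc: float, added_vol: float, base_vol: float) -> float:
    """Concentration after adding added_vol of stock to base_vol (same volume units)."""
    return stock_conc * added_vol / (added_vol + base_vol)


# 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells.
bme_mM = diluted_concentration(1420.0, 1.7, 100.0)

# 0.4 ml of 50 mg/ml ampicillin added per 100 ml of SOB agar.
amp_mg_per_ml = diluted_concentration(50.0, 0.4, 100.0)

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {amp_mg_per_ml * 1000:.0f} ug/ml")
```

On these assumptions the beta-mercaptoethanol concentration is close to 24 mM, consistent with the nominal 25 mM, and the ampicillin concentration in the bottom layer is about 200 ug/ml.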

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in 
collaboration with 5Prime -> 3Prime Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all 
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was 
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted. 

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10 s pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI).

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are analyzed on a
combination of ABI 373 DNA Sequencers and ABI 377 DNA Sequencers. All of the dye terminator sequencing reactions are
analyzed using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and
ABI 377 DNA sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
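
The success rates and read lengths above can be combined into an expected yield of usable sequence per attempted reaction. The following Python fragment is a rough sketch under stated assumptions: the names are illustrative, each chemistry is considered on its own, and the 14 Mb target is simply the 5X coverage figure for a 2.8 Mb sequence used in the probability analysis.

```python
import math

# Rough planning sketch; names are illustrative. Success rates and usable
# read lengths are the approximate figures stated in the text.
CHEMISTRIES = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def expected_usable_bp(success_rate: float, read_length_bp: int) -> float:
    """Average usable bases obtained per attempted sequencing reaction."""
    return success_rate * read_length_bp


def reactions_needed(target_bp: int, success_rate: float, read_length_bp: int) -> int:
    """Attempted reactions needed, on average, to accumulate target_bp."""
    return math.ceil(target_bp / expected_usable_bp(success_rate, read_length_bp))


TARGET_BP = 14_000_000  # 5X coverage of a 2.8 Mb sequence

for name, (rate, length) in CHEMISTRIES.items():
    per_reaction = expected_usable_bp(rate, length)
    needed = reactions_needed(TARGET_BP, rate, length)
    print(f"{name}: {per_reaction:.2f} usable bp/reaction, {needed} reactions")
```

In practice the chemistries are used together, so these single-chemistry counts only bracket the number of reactions an actual project would require.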

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373 
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes 
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing 
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling 
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., 
one-primer synthesis) were performed, each including denaturation, annealing of primer and template, and extension 
(i.e., DNA synthesis). A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the 
need for an oil overlay. 
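
The distinction between linear (one-primer) cycling and two-primer PCR can be illustrated with a small calculation. The following is a minimal sketch only; the function names and the `efficiency` parameter are hypothetical and are not part of the protocol described above.

```python
def linear_products(template_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Extension products from one-primer (linear) cycling.

    Each cycle copies only the original template, so the yield grows
    in proportion to the number of cycles. `efficiency` (assumed, 0..1)
    is the fraction of templates extended per cycle.
    """
    return template_copies * cycles * efficiency


def exponential_products(template_copies: int, cycles: int) -> int:
    """Idealised two-primer PCR yield, doubling every cycle (for contrast)."""
    return template_copies * 2 ** cycles


if __name__ == "__main__":
    # Thirty cycles, as stated in the protocol above.
    print(linear_products(1, 30))
    print(exponential_products(1, 30))
```

Under these idealised assumptions, thirty cycles of one-primer synthesis yield at most thirty labelled products per template molecule, which is why cycle sequencing is described as linear amplification.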

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain 
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four 
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual 
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and 
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template 
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both 
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable 
sequences. 

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377 
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols. 
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and 
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace) 
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence 
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence 
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. 

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review 
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System 
Sciences, IEEE Computer Society Press, Washington D. C., 585 (1993)). The system used to collect and assemble 
the sequence data was developed using the Sybase relational database management system and was designed to 
automate data flow wherever possible and to reduce user error. The database stores and correlates all information 
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output 
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based 
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which 
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort. 

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence 
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments 
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a 
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The 
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. 
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add 
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a 
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., 
Methods in Enzymology 164: 765 (1988)). The contig is extended by the fragment only if strict criteria for the quality 
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched 
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal 
coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of 
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of 
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information 
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from 
two ends of the same template point toward one another in the contig and are located within a certain range of base 
pairs (definable for each clone based on the known clone size range for a given library). 
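
The hash-table step described above can be sketched as follows. This is an illustrative outline only and is not the TIGR Assembler code: the function names are hypothetical, only the forward strand is indexed (a real assembler would also consider reverse complements), and no alignment or match criteria are applied.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length stated in the text


def build_kmer_index(fragments, k=K):
    """Map every k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments, k=K):
    """Return {(id_a, id_b): number of distinct shared k-mers} for fragment pairs."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_counts(fragments, k=K):
    """Count potential overlap partners per fragment.

    Fragments with unusually many partners are candidates for
    repetitive elements (the cut-off itself is left to the caller).
    """
    counts = {frag_id: 0 for frag_id in fragments}
    for a, b in candidate_overlaps(fragments, k):
        counts[a] += 1
        counts[b] += 1
    return counts


if __name__ == "__main__":
    frags = {
        "f1": "ACGTACGTTAGCCGATAGGCTA",
        "f2": "TAGCCGATAGGCTATTTTGGGG",
        "f3": "CCCCCCCCCCCCCCCCCCCCCC",
    }
    print(candidate_overlaps(frags))
    print(overlap_counts(frags))
```

Candidate pairs found this way would then be passed to a gapped alignment step, such as the modified Smith-Waterman alignment mentioned above, before a fragment is accepted into a contig.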

3. Identifying Genes 

The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf, 
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a 
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search 
method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence 
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and 
compared to a non-redundant database of known proteins generated by combining the Swiss-prot, PIR and GenPept 
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or 
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases. 
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels 
are shown in Table 3. 
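
Finding ORFs of a minimum length can be illustrated with a short sketch. This is not the zorf program, whose rules are not described here; the sketch assumes an ATG start codon and the three standard stop codons, scans all six reading frames, and uses hypothetical function names.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end) of ATG..stop ORFs in the three frames of one strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    the coding codons (start codon included, stop codon excluded).
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq, min_codons):
    """Return (start, end, strand) tuples in forward-strand coordinates."""
    n = len(seq)
    result = [(s, e, "+") for s, e in orfs_in_strand(seq, min_codons)]
    for s, e in orfs_in_strand(reverse_complement(seq), min_codons):
        result.append((n - e, n - s, "-"))
    return result


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    print(find_orfs(demo, min_codons=5))
```

With `min_codons` set to 80 or 120, the same scan would select candidates at the amino acid length thresholds used for Tables 2 and 3; the database comparison itself is a separate step.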

ILLUSTRATIVE APPLICATIONS 


1. Production of an Antibody to a Staphylococcus aureus Protein 

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the 
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as 
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by 
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the 
protein can then be prepared as follows. 

2. Monoclonal Antibody Production by Hybridoma Fusion 


Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from 
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or 
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein 
over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. 
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells 
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused 
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. 
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay 
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified 
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. 
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular 
Biology, Elsevier, New York, Section 21-2 (1989). 

3. Polyclonal Antibody Production by Immunization 


Polyclonal antiserum containing antibodies to heterogenous epitopes of a single protein can be prepared by 
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance 
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and 
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of 
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and 
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple 
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, 
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971). 

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as 
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the 
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, 
Wier, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum 
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, 
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. 
Soc. For Microbiology, Washington, D. C. (1980). 

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which 
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively 
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal 
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make 
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic 
reagent. 

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS:1-5,191, 
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers 
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, 
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are 
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow. 

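
The primer selection preferences above (minimum length, similar G/C ratio, similar melting temperature) can be checked with a short calculation. The sketch below is an illustration only: it uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees C, which is a rough approximation for short oligonucleotides, and the function names and the `max_tm_diff` tolerance are assumptions rather than values taken from this document.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def pair_is_balanced(fwd, rev, min_len=15, max_tm_diff=4):
    """Check a primer pair against the stated preferences.

    min_len=15 follows the text ("preferably at least 15 bases");
    max_tm_diff is a hypothetical tolerance chosen for illustration.
    """
    if len(fwd) < min_len or len(rev) < min_len:
        return False
    return abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_diff


if __name__ == "__main__":
    fwd, rev = "ACGTACGTACGTACGT", "TGCATGCATGCATGCA"
    print(gc_fraction(fwd), wallace_tm(fwd), pair_is_balanced(fwd, rev))
```

Because the Wallace rule depends only on base composition, two primers of equal length and equal G/C ratio receive the same estimate, which is the reasoning behind matching G/C ratios within a primer pair.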
4. Gene expression from DNA Sequences Corresponding to ORFs 

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector 
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially 
available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate 
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular 
expression organism, as explained by Hatfield et al., U. S. Patent No. 5,062,767, incorporated herein by this reference. 

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the 
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence 
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using 
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney 
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes 
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA 
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus 
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at 
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus 
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from 
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and 
ligated to pXT1, now containing a poly A addition sequence and digested with BglII. 

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, 
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the 
transfected cells in 600 ug/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant. 
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or 
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product, 
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into 
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA. 

Alternatively, if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally 
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the 
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from 
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5 
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These 
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods 
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance 
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be 
produced using in vitro translation systems such as in vitro Express™ Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes of clarity and understanding, one 
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true 
scope of the invention. 

All patents, patent applications and publications referred to above are hereby incorporated by reference. 

Table 4 



ORF    SEQ ID NO    BLAST HOMOLOG    Antigenic Regions: Region 1    Region 2    Region 3    Region 4 



168_6 ! 


5192 


lipoprotein 


36-45 


84-103 


1 DC* 1 O 1 


1 f D* 1 OD 


238_1 I 


pig) 
5193 


chrA 


21-39 


48-58 


R4-QC. 


CDC'CHj 


51_2 i 


5194 


OppB gene product (B. sub* 


20-36 


70-79 


1 UU* 1 1 C 


It 1*1 J 1 


278.3 j 


5195 


lipoprotein 1 


20-29 


59-73 


QC Q7 


1 C7 171 

i bt- 1 71 


276.2 


5196 


lipoprotein 


21-33 


65-74 


1 77 1 AC 


91 1 *J O A 

CI 1 -ecu 


45_4 


5197 


ProX i 


28-37 


59-69 


Q C.I AA 


1 9 A_l 50 


31 6_8 


5198 


hypothetical protein 


45-54 


88-97 


1 Q9 IflJ 

1 oc-lUt 


CHD-CDD 


154.15 


5199 


* unknown ! 


31-40 : 


48-58 


79-00 


AC 1 Aji 

95-104 


228.3 


5200 


! unknown ! 


25-38 ! 


40-52 


CA ~7 A 

b4-74 


DA OO 


228.6 


5201 


1 unknown i 


29-41 ! 


89-101 


1 £o-14o 


1 7*3 1 O A 

I 7o-lo4 


50.1 


5202 


: unknown 


21-33 


52-61 


ICQ 1 O 7 


1 97-206 


112.7 


5203 


'iron- binding periplasmic 


21-31 i 


58-67 


O 7 1 A1 


111 1 T» A 

i n-120 


442.1 


5204 


! unknown ' 


30-39 1 


91-100 


1 CC- \ Df 


182-192 


66.2 


5205 


; unknown 


50-59 ! 


1 04-1 1 6 


1 77 1 ?C 
I C\( m \ DO 


1 C7 tB9 

1 b7- lot 


304_2 


5206 


?Q-bindinq periplasmic i 


19-28 i 


48-57 


7t OA 

Z5:55 


1 11C 

103-116 


: 44.1 


5207 


■hypothetical protein 


27-36 ! 


86-95 


1 7A 1 oo 

1 Z9-1 So 


1 9Z-ZQ1 


161.4 


5208 


:SphX 


27-44 | 


149-161 


ice i 7 c 
i Ob- 1 75 


O A1 O 1 A 

201-Z 10 


46.5 


5209 


cmpC (permease) 


21-33 1 


61-70 


OO ft I 

83-92 


1 AA 1 AA 

100-109 


942.1 


5210 


ItraH [Plasmid pSK41] ! 


83-92 i 


109-1 18 


1 27-142 




5.4 


5211 


:0RF (S. aureus) ! 


12-22 


87-96 


1 1 1-1 20 


*l C 1 1 CA 

1 5 1-1 bO 


20.4 


5212 


. peptidoglycan hydrolase (S: 


24-34 


129-138 


141-1 50 


1 C 1 171 

I Dl-171 


328.2 


5213 


: lipoprotein (H. flu) i 


81-90 


123-133 


7AA 7AA 

Z90-Z99 




520.2 


5214 


ifibronectin binding protein : 


44-54 


63-79 


o 1 oa 
81-90 


QC lift 

95-1 10 


771.1 


5215 


! emm1 qene product (S. py< 


30-39 


65-82 


t OC 1 AC 

! 9b-10b 


1 1 7 17 1 
1 1 t-1 tl 


999.1 


5216 


predicted trithorax prot. (D 


7-16 


120-129 


! 1 C 7 1 CC 

1 157-lbb 




853.1 


5217 


ORF2136 (Marchantia polyr 


43-52 


88-97 


t 1 a ~> ill 
t lUZ-lll 




287.1 


5218 


psaA homolog 


1 3-22 


28-44 


1 77 o 7 


1 1 1 A 1 IX 

! 114-124 


288.2 


5219 


cell wall enzyme 


1 4-23 


89-98 


I 




596.2 


5220 


penicillin binding protein 2b 


40-49 


59-68 


t 7 C 07 

! 7b-o7 


i 1 AC 11C 

! lUb-115 


217.5 


5221 


fibronectin/fibnnogen btndn 


28-37 


40-49 


1 C 7 71 

! bc-71 


1 QO 111 

1 95-1 1 1 


217.6 


5222 


fibronectin/fibrinogen bp : 


10-19 


31-40 


1 C A C7 

\ 54-bZ 


i 73 GO 

! ( j-9c 


528.3 


5223 


myosin cross reactive protc 


4-13 


29-47 


1 bU- f D 




171.11 


5224 


EF 


7 A 7 1 1 


Q.1 1 1 A 


1 





CO A 


DC CD 


■penicillin Dinoing proiein to 


1 7 7 1 1 


oy-oo 


1 95-104 




353_c 


C9 7C 

DC. CO 




AC C C 1 


C7 71 


I 

I. _. 


j 




DCCf 


. 7Q l> V\ *\ nm+Ain tts flfnA rant* 

Cs KUa protein in nmn reyi* 


CD'DC 


CO 7Q 


j 94*103 


: 175-184 


'S A 7 A 


1 DCCO 


IWItCnlny mutliliy 


10-19 


48-60 


' 83-92 


1 111-121 


69j3 


i 5229 


arabinogalactan protein 


97-106 


132-141 


158-167 


: 180-189 


70.6 


: 5230 


' nodulin 


36-45 


48-57 


1 137-160 


; 179-188 


129.2 


i 5231 


' glycerol diester phosphodie 


8-17 


41-50 


1 55-74 


97-106 


58.S 


1 5232 


PBP (S. aureus) 


26-35 


70-79 


; 117-126 


; 152-161 


188.3 


5233 


MHC class II analog (S. aure 


72-81 


94-103 


115-124 


i 136-145 


236.6 


: 5234 


histidine kinase domain (Die 


24-33 


52-67 


81-94 


: 106-121 


310.8 


5235 


clumping factor (S. aureus) 


59-71 


77-86 


! 93-102 


! 118-127 


601.1 


5236 


novel antigen/0RF2 (S. aui 


45-54 


91-104 


108-117 


! 186-195 


544.3 


5237 


ORF YJR151c(S. cerevisae! 


76-90 


101-111 


131-140 


154-164 


662.1 


5238 


MHC class II analog (5. aure 


22-32 


71-80 


89-98 


114-122 


87.7 


5239 


5' nucleotidase precursor (' 


29-45 


62-71 


i 105-114 


125-137 


120.1 


5240 


B65G qene product (B. sub 


102-111 

Table 4 



ORF 




Antigenic i 


Regions 


(cent) 








Region 5 


Region 6 


Region 7 


Region 8 ! 


Region 9 


Region 10 


168.6 


244-272 


303-315 


1 i • : 


238.1 


260-269 


291-301 


308-317 ! I 


51.2 


140-152 


188-208 ! 


211-220 j 


256-266 i 


273-283 ! 




278.3 


198-209 : 


i 




i 






276.2 


255-268 


! 


i 


1 ■ i 


45.4 


177-199 


221-230 ! 


234-243 


268-279 ! 


284-293 


304-313 


316-8 i i 




154.15 


148-157 


177-187 


202-211 


i = 
i 


228.3 


101-119 . 


139-154 


166-181 






228.6 








i 
t 


50_1 












112.7 


136-149 


197-211 


218-229 


253-273 


i 
i 


442.1 


199-210 : 


247-257 


264-277 


287-309 1 




66.2 












304.2 


178-187 


250-259 










44.1 


i 










161.4 ; 






i 

_ 1_ 


46.5 


131-141 ■ 


162-176 


206-215 


243-252 


264-273 


285-294 


942_1 ; 






i 

! 


5.4 


189-205 


230-239 


246-264 


301-318 


340-354 


378-387 


20.4 


i 202-212 


217-234 


260-275 


314-336 


366-373 


380-391 


328_2 








1 


520.2 










- 771.1 


i 145-154 










999.1 i 






I 


853_1 ! 






t 


287.1 


1 154-164 








1 


288.2 I 








596.2 


; 121-130 






: i 


217.5 


244-253 


259-268 


288-297 


302-311 i : 


217.6 


! 1 44-1 58 


174-183 


188-197 


207-216 


226-242 




528.3 




: I 


171.11 




i ! 


63.4 i 


1 




353_2 ! 


1 


i 


74? 1 


! 1 97-207 




1 ! - ! 


342_4 i 


i i ■ 1 


69.3 


195-211 




1 


70.6 


: 206-215 


263-272 


i 291-301 


i 331-340 


358-371 


1 390-414 


129.2 


117-127 


141-157 


■ 168-183 


. 202-211 


222-231 


= 261-270 


58.5 


1 84-203 


260-269 


: 275-299 


; 330-344 


372-381 


; 424-433 


188 3 i i ; 


236.6 


138-147 


163-172 


! 187-198 


i 244-261 


268-278 


308-317 


310.8 


: 131-140 


144-153 


! 177-186 


I 190-199 


i 204-213 


216-227 


601.1 


! 208-218 




1 
i 








544.3 


. 170-179 


184-193 


! 224-235 


I 274-287 


327-336 


352-361 


662_1 




87.7 ! 


: i 


120.1 








Table 4 



ORF 


Antigenic 


Regions |(cont) ! 




Region 1 1 Region 1 2 


Region 13 ! Region 14 . Region 15 1 Region 16 


168.6 


I I = i 


238.1 


I I i 


51.2 




278_3 




276.2 




i ' i 


45.4 




i : 


i 

i 

i 


315.8 






154_15 1 i 




! 




228.3 ! i 




; 




228.6 


i 




! 




50.1 


I 




1 




112.7 


1 




! : 




442.1 


i 




i 




66.2 


i 




i 




304.2 










44.1 


r ■ — -_ 

1 


1 





161.4 ! ! 






46.5 I 306-315 ; 








942.1 i ; 




1 

1 




5.4 ! 393-407 \ 416-426 


456-465 


1 




20.4 1 396-405 : 410-419 


461-481 






328.2 


1 




1 




520.2 


i 
1 




! 

! 




771.1 


1 








999.1 










853.1 










287.1 








288.2 ! | 






596.2 i I 1 




217.5 1 1 






217.6 i 








528.3 j 








171.11 j 








63.4 I 








353.2 I 








743-.1 ! : 






342.4 i 








69.3 ! 


j 


70.6 ! 453-471 506-515 , i 


129.2 i 296-315 j | 




58.5 • ' i : 


188.3 ' ! | i 


236.6 ! 358-377 410-423 i 428-439 !442-457 467-476 ! 480-493 


310.8 I 238-251 256-275 i 281-290 1296-310 314-333 3^8-347 


601.1 i ; I | 


544.3 1 t : 


662.1 ! ! I 


87.7 I j 


120.1 i 

Table 4 



ORF 


Antigenic 


Regions (cont) 








Region 1 7 Region 1 8 


Region 19 Region 20 


Region 21 


Region 22 


168_6 




: ! 




238_i i : 


Sl.2 • \ ' ; 


278.3 ) ! i j i 


276.2 i ! ! { \ 


45.4 1 ! i i ! 


316_8 










154.15 1 


i i 




228_3 ; 


i 






228.6 ; 




i 


50.1 ; 








11 2.7 1 


! 






442.1 




i 
I 






66.2 


i 


i 






304.2 




t 






44_1 I 


t 


i 


161.4 j 


* 


I 


46.5 


1 ! 




942.1 i ! I 




5.4 ! ! 




20.4 j 




l 


328.2 | 


i 






520.2 I 


< 






771_1 ! j 






999.1 ! 








853.1 I 




i 


287_1 j 




I 


288.2 1 






596.2 j 






217.5 






217_6 ! 






528.3 ! 


• 






171.11 


i - 


63.4 s 1 1 ! 


'353.2 | 1 • I ! 


743^1 i 


i 


342.4 I 


i 


693 : : ! ! . 


70.6 ! ! ! 


129.2 ; 1 ! 


58.5 ; ! 


188.3 i 


236.6 ! i ■ 


310.8 


357-366 370-379 


429-438 443-452 


478-487 


551-560 


601.1 










544.3 










662_1 


87.7 










120.1 

Table 4 

ORF       Antigenic Regions (cont) 
          Region 23   Region 24   Region 25   Region 26   Region 27   Region 28 
168_6 
238_1 
51_2 
278_3 
276_2 
45_4 
316_8 
154_15 
228_3 
228_6 
50_1 
112_7 
442_1 
66_2 
304_2 
44_1 
161_4 
46_5 
942_1 
5_4 
20_4 
328_2 
520_2 
771_1 
999_1 
853_1 
287_1 
288_2 
596_2 
217_5 
217_6 
528_3 
171_11 
63_4 
353_2 
743_1 
342_4 
69_3 
70_6 
129_2 
58_5 
188_3 
236_6 
310_8     622-632     670-685     708-718     823-836     858-867     877-886 
601_1 
544_3 
662_1 
87_7 
120_1 

Table 4 



ORF Antigenic; Regions 


(cont) 


Region 29 


Region 30 




168.6 j j 


238.1 j i 


51 _2 i : 


278.3 i i 


276.2 : i j 


45.4 i 






31 6_8 






154.15 • 






228.3 ! ~1 


r~ 




228.6 






so_i ; 




1 1 2_7 






442.1 ! 






66.2 i 






304.2 






44_1 






161.4 • | 




46.5 ! I 




942.1 i 




5.4 ! 




20.4 ; ! ! 


328.2 ' i 


520.2 - 




771.1 




999.1 ! ; 


853.1 i 1 




287_1 : 




288.2 i 




596.2 I 




217.5 ■ ! ! 


* 217.6 i 


i 


528.3 f ! 


171.11 ! i ! 


63.4 I i I 


353.2 - ! | 




74^1 ! 




342_4 1 j 




6913 ! 1 


70.6 ! 


129.2 ! 


58.5 


188.3 


236.6 i 


310.8 






601.1 


544.3 


662.1 


87.7 


120.1 

Table 4 



ORF 




BLAST 


Antigenic i Regions 










HOMOLOG 


Keyion i 


Kegion c 


Region 3 


R&aion 4 


46.1 


15241 


aldehyde dehydrogenase 


ft 1 7 


Jb-jc 


83-96 


* 1 1 2-1*27 


63_4 


5242 


glycerol ester hydrolase (P. 






93 1 07 


1 23~1 33" 


174.6 




5243 ketooantoate hvdroxvmeth 








265-274 


206_J6;5244 


ornithine acetyltransferase 


1-10 


~~3~4-43 


54-63 


194~210 


267.1 


i5245 


NaH-antiporter protein (E. r 


120-129 


332-347* 


398-408 




322_1 15246 


acriflavin resistance protein 


58-75 


153-164 


203-231 


264-284 


415.2 


;5247 


transport ATP-binding prou 


108-126 


^218-227 , 


298-308 


315-334 


214_3 


!5248 


2-nitropropane dioxygenase 


123-136 


216-233 


283-292 


297-306 


587.3 i5249 


clumping factor 


5-14 


43-54 


59-68 


76-95 


685_1 


;5250 


signal peptidase 


59-68 


72-81 


86-95 


_._9~9-108 


54.3 


15251 


fibronectin binding protein 1 


23-32 


37-46 


50-59 


89-98 


54_4 


"'5252 


fibronectin binding protein 1 


43-52 


66-75™-' 


"-j§J04~r 




54ls 


15253 


fibronectin binding protein 1 


49-60 


81-90 






54.6 


;5254 


fibronectin binding protein 1 


55-71 


82-97 


139-158 
96-105 


_A?5:18_6_V 


328_1 


; 5255 


lipoprotein (H. flu) 


11-20 


61-70 





Table 4 



ORF 








Antigenic; Regions ' 


(cont) 








Region 5 






Region 6 i 


Region 7 ! 


Region 8 


Region 9 


Region 1 0 


46.1 


215-242 






333-352 ! 


376-385 1 


416-432 


471-487 




63.4 


' 145-154 






191-202 


212-223 


245-265 


274-283 


291-300 


174.6 










. - J 








206.16 


239-259 


i 


275-284 




' 






267.1 


















322.1 


. 298-319 






350-359 










415.2 


344-353 






371-380 


395-404 


456-465 


486-495 


. 518 : 527~"1 


214.3 


i 318-337 


— 




"365-375 










537.3 


"i 1 06-1 1 5 






142-151 


156-166 


173-182 


186-198 


204-213 


685.1 


: 113-122 






130-145 










54.3 


'! 128-138 






185-194 


217-226 


251-260 


268-277 


295-305 


54.4 


I 175-188 


--1 




191-200 


203-212 


220-229 






54.5 


1 














54.6 


i 220-230 




287-304 


317-326 


344-353 


364-373 


378-387 


328_1 ! 

Table 4 



ORF 


i 


Antigenic! Regions i(cont) 








Region 1 1 


_ Region 12 


Region 1 3 


Region 1 4 


Region 1 5 


jtegion .17 


46.1 












63.4 


306-315 


319-328 


366-376 


395-420 


453-462 


_467-476 


174.6 














206_16 








! i 


267.1 ! i ! 


i ! 


322.1 i i 








415.2 


539-555 










214_3 ; 1 ! 






i 
I 


587.3 


217-226 


278-287 


318-327 


332-342 


351-360 


j 377-386 


685.1 












i 


54^3 


316-325 


329-345 


355-372 


387~396 






54.4 














" 54.5 


i ■ 








"i " 


54.6 


396-407 


427-436 


514-531 


541-550 


569-578 


1612-622 


328.1 


■ i 


1 1 



Table 4 



ORF 




Antigenic! Regions 


i(cont) 








Region 18 


Region 19 ; 


Region 20 




Region 21 


Region 22 


Region 23 


46.1 i ' 




63.4 


485-500 


513-525 












174.6 ; 








206_16 




■ 






- 








267.1 














322.1 
















415.2 








_ 








214.3 
















587.3 


396-405 


426-442 " 


459-470 




485-494 


505-514 


531-562 


685.1 


i 








54.3 


455-462 


472-491 


517-536 






i 


54.4 












54.5 












t 


54.6 


639-648 


673-681 


703-715 


723-732 


749-760 


|7>2^88 


328.1 













so 

Table 4 



ORF 


Antigenic Regions 


(cont) 


i 




Region 24 \ Region 25 ! Region 26 


Region 27 


Region 28 


_Regiqn_29_ 


"~~46_T 






63_4 i i 






174.6 


i 







- - .. 




206.16 


; 








267.1 


: i i 








322.1 


: i i 







_____ 


415.2 ! 






214.3 











587.3 


567-578 : 584-601 j 607-840 


844-854 


858-870 


L 877^886 


685.1 j j 






54.3 1 i 








54.4 | i 








54_5 ; 1 ! 








54.6 


793-802 1811-826 834-848 


866-876 


893-903 


907-918 


328.1 ' 1 ! 

Table 4 

ORF       Antigenic Regions (cont) 
          Region 30   Region 31 
46_1 
63_4 
174_6 
206_16 
267_1 
322_1 
415_2 
214_3 
587_3     889-911     927-936 
685_1 
54_3 
54_4 
54_5 
54_6      925-944     951-997 
328_1 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: US 

(F) POSTAL CODE: 20850 

(ii) TITLE OF INVENTION: Staphylococcus aureus 
nucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 5255 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 

(B) COMPUTER: HP Vectra 486/33 

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/009,861 

(B) FILING DATE: 05-JAN-1996 



(2) INFORMATION FOR SEQ ID NO: 1:
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5895 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:




TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120

aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240

TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300

CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 360

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 480

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780

TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020

TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260

TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320

TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380

CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440
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TCATTTGCAA AGGGCGAAAT GGGTTCTTAC 

TGAGTTATCT ATTATAAAAA AATAAACATA 1560

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680

GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920

CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980

AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100

TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280

CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460

TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580

GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820

TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940

AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060

TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240
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CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360 

TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420 

TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480 

TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540 

AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600 

TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660 

GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720 

CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780

ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840 

GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900 

GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960

CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020 

CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080 

ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140 

ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200 

GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260 

ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320 

CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380 

ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 4440

AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500 

TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560 

ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620 

CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680 

AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740 

CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800 

TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860

CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920

TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980

TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040
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AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160 

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220 

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280 

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340 

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400 

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460 

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520 

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640 

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700 

GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820 

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880 

TGATAGTGCT AAAGA 5895 
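
As a reading aid for the listing format used here, each row carries up to six ten-residue blocks followed by the running position of the last residue in that row. The following is a minimal sketch, not part of the original listing, of how transcribed rows could be checked against those running counts; the function name `check_rows` and its `start` parameter are illustrative assumptions.

```python
import re

# One row: blocks of 1-10 letters (IUPAC codes, either case), then the running count.
ROW = re.compile(r"^((?:[A-Za-z]{1,10}\s+)+)(\d+)$")


def check_rows(lines, start=0):
    """Return (row, expected, printed) for every row whose count looks wrong.

    `start` is the position of the last residue before the first row given.
    Rows that do not parse at all are reported with expected/printed = None.
    """
    problems = []
    position = start
    for raw in lines:
        row = raw.strip()
        if not row:
            continue  # blank separator lines between rows
        match = ROW.match(row)
        if match is None:
            problems.append((row, None, None))
            continue
        residues = "".join(match.group(1).split())
        position += len(residues)
        printed = int(match.group(2))
        if printed != position:
            problems.append((row, position, printed))
            position = printed  # resynchronise on the printed count
    return problems


if __name__ == "__main__":
    rows = [
        "ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880",
        "",
        "TGATAGTGCT AAAGA 5895",
    ]
    print(check_rows(rows, start=5820))  # prints []
```

A row whose printed count disagrees with the residues actually present (for example after a lost block or a misread digit) is returned for manual inspection rather than corrected automatically.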
(2) INFORMATION FOR SEQ ID NO: 2: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6796 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60 

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120 

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180 

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240 

CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300 

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360 

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480 

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540 


AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720 

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780 

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840 

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900 

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960 

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020 

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140 

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200 

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260 

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320 

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380 

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500 

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620 

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680 

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740 

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800

GAA^TAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860 

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920 

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980 

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100 

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160 

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220 

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280 

AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340 
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TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460 

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580 

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2640 

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700 

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760 

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820 

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940 

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060 

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240 

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300 

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420 

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480 

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAQCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660 

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720 

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780 

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840 

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3900 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960 

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4080

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140
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ATTACAACAG TTTAGTGGTA ACTTAGAAAG 


AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260

AGGCGATAAA CAATTAPfiTP AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320

TTTAGTTGTC AGTGGAAPTf: GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380

ATCAGGAGGC AAPTAPGPAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440

gtptgptgaa fiaaaTfifiraT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500

PA APR AT A AT ATTGTTGTPP AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560

CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620

AUiV-UAAAAl? AAAIwGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680

t-tsTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740

AAvjLAAVjAAA 1 rTCACL IAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800

vjAAAI IAjLAA gaagaatggc CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920

v?AJ. Ij ill LAtj i AAvjA ii AGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980

PPTMi/Vr'PR TV nV~" 7V 7\ 7\ 7V 7V /""T 1 IAHWjCUA A IbAAAAAL J. TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040

V~ri/\H^,w4AIA A I 1 1 1 ALiA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280

CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 5340

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400

GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820

CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880

GATTTATCAT TCGAAGCACC AAGTATGCCG AATGCAGTTG TAGATATTAC CCCACAATAT 5940
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AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6060 

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120 

AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 6180 

CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240 

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 6300 

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6360 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420 

GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6540 

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6600 

GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660 

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720 

CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780 

TTAGAAAAAA GTAAAT 6796 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120 

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180 

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240 

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300 

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360 

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480 

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540 
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TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660 

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720 

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960 

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020 

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 1680 

35 TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740 

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800 

AAGTAATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860 

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920 

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980 

AATATAACCT TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2040 

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 4: 

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120 

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240 

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480 

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540 

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600 

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 660 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780 

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840

ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080 

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140 

TACC&AACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440 

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500 

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560 

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGCtAGCGC 1680 
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TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT ' 1800 

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860 

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920 

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980 

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400 

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 2760 

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820 

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940 

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000 

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240 

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300 

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420 

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480 
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GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3 600 

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3660

GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3720

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3780

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3840

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3900

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3960

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020 

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080 

GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG 4140 

ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 4200

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4260 

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4320 

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4380

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4440

CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4560

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4620

GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4740

CTTGJAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4800

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4860

TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4920

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4980

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5040

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 5220 

TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5280 
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CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 5400 

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460 

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520 

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580

TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640 

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700 

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760 

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880 

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 5940 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000 

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060 

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240 

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 6360 

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 6420 

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480 

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540 

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600 

CATAACCACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660 

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720 

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840 

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900 

CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020 

TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080 
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AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200 

TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260 

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320 

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 7380 

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 7440 

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500 

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560 

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7620

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680 

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740 

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7800 

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860 

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920 

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980 

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040 

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 8100 

TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160 

ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220 

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340 

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400 

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460 

AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520 

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8580

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8700

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820 

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880 




CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9000 

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060 

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120 

AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180 

GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240

ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 9300 

ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360 

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 9480 

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 9540 

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA ATCATTTCAC 9600 

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 9660 

TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 9720

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9780 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9840 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020 

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080 

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 10140 

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200 

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260 

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320 

TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440 

TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500 

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560 

GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT ATAAAAAGGC CTCTTGAACT 10620 

CCGTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680 
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TTACTGCAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800 

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860 

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10920 

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 10980 

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100 

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160 

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 11280 

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340 

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460 

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520 

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580 

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640 

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760 

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820 

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940 

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000 

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 12060 

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120

CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240

TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420 

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480 
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TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600 

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660 

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720 

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780 

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840 

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900 

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960 

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080 

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 13140 

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200 

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260 

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320 

A 13321 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60 

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120 

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180 

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300 

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360 

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480

AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540
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AAACAGGACT 


CCACATAAAA 


ATCAACTCCT 


TTATATAPPA 


TAATYIATAPT 


AT ZVTTTTPTIi 
m AX X X ILln 


C C A 




GTTTATTTCA 


ATTTTTCAGT 


TTTTAAAAAT 


unvr i. j. j. v» x o J. 


MTT'I'&TTT AT* 
xxi Xnx 1 X/\l 


A PfiPTTTTPT 
nUUt X 


Ton 


GTTTTCTTTT 


TAAATTTTAT 


CTTTTTGTTA 

w a- x x x x w x x<n 


TTPPATTPAT 


1 0 Innnni 1 V, 


T A TT A A A TT A 


1 □ A 




ACATAAAATT 


TTTCATGCCC 


TATTTTATTT 


fSTmATfJAflA 


TATPA 

1A1 \-nr\ loin 


A APJxPTPA AT 


Oil A 


ATTG TTTTT A 


AATAGATTTG 


ATGCAACG A C 


TfSATAAAPPft 


TATTAPTATP 


luLlnlul in 


y u u 




TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 1320

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 1380

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440








AATAvjAToaT 


ATTATCGGTA 


TGTATATATC 


ACCTTTCATA 


1500 


TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340
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20 
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30 



CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 2460

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2520

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2580

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2640

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3000

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3060

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3240

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3360

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAAGAT ATTTGCGTAA TCGTAACACT 3480

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3540

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3600

ATA*FTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3660

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3720

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3780

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3900

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3960

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTAXT TTTTTCATCT 4140
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TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260 

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4320 

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4380 

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440 

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4500 

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4560 

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 4620 

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4740 

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4800

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4860 

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4920

ATAATTTTGT TAATTGTCGC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4980

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 5040 

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100 

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220 

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 5280

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 5400 

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 5460

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580 

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700 

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760 

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5940
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TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060 

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240 

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6300 

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6360 

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420 

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6540 

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660 

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6780 

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900 

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020 

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 7080 

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 7140

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200

TCTCCTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260 

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7380

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7560 

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 7680 

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740 
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ATTACTGCAT 


TTGTAAgAGG 


TGCAAGTTCT 


GTCACAAATA AAAATTCTTG 


CTTATCAGGT 


7860 


TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7920

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7980

AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8040

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 8160

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 8220

TTAGCTCTAT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 8280

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8340

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 8400

TGctTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 8460

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 8520

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8549



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3601 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 60 
AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 120 

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180 
GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 240

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 300
AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360 
ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 420 

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480
GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 540

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 600 
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TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720 

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780 

TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840 

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900 

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 960 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080 

TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200 

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260

CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320 

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 1380

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 1440

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 1500 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560 

AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 1680 

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 1740

GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800 

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860 

TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 1980

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 2040 

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100 

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160 

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220 

GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 2280

AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340 

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 2400 
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CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 2520 

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 2580 

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 2640

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 2700 

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 2760

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 2820 

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880 

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 2940

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3000 

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3060

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 3120 

AcCAaTTGGT CCTTTTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 3180 

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGcGT CACCACTACC 3240

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 3300 
TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360 
AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420
CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480 
ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3540 

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600 
A 3601 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60 

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120 

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180 
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TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300 

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360 

GCCGTGCCAT TATTAAGACA T^AAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420 

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180 

TTCATmTtnAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240 

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300 

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480

TGACXACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540 

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720 

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780 

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840 

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960 

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020 
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AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 1140 

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1090 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360 

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020 

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080

CAATAAGAAA 1090
(2) INFORMATION FOR SEQ ID NO: 10: 
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(A) LENGTH : 904 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 
ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 
GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 
AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 
AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 
AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 
AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 
AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 
AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 
TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 
ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 
AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 
TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 
AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 
TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT
GACG 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 
AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 
TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 
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TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300 

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 360 

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420 

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACG CCGCCTGTAC 480 

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 600 

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660 

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 720

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780 

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 840

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 900

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960 

GCAAATGACT TAGTGCTTTG TTCATAGCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020 

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080 

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260 

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 1440

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500

CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620 

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800 

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860 

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920 

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980 
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GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 2100 

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2220 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280 

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400 

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580 

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640 

AGGAATGATA CATATTTATT GATAGAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760 

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2820 

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3000

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3060 

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120 

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 3300 

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360 

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480 

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3540 

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600 

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3720

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780
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ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3900 

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3960

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4020

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGGCTA 4080

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 4140 

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4200 

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4260 

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4320

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440 

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4500

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4560 

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4620

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 4680

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 4740

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4800

ATATCAGTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4860

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4920

TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040 

TACfTGCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100

ATTATCAAAA AAATCAAATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160 

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 5340 

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 5400 

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460 

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAGCATGGAT 5520 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 5580 
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ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 5700 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 5760 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 5880 

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5940 

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6000 
CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 6060

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 6180 

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 6240 

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300 

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 6360 

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 6420 

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 6480

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6540 

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600 

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 6660 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6720 

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6840 

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900 

TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960 

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080 

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 7140 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 7320

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 7380

CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7560

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 7620

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7980

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8040

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 8160

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 8220

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 8280

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 8340

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 8400

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 8460

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8520

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 8580

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 8640

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 8700

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 8760

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 8820

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 8880

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8940

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 9000

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 9060

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 9120

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAG^GATTGG 9180






EP 0 786 519 A2 



TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATTCGTCG GCATTTGTAA 9360 

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420 

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480 

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540 

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 9780 

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9840 

ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 9900

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 9960 

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 10020 

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 10140

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 10200

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260 

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320 

GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10500 

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10860

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10920 

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 10980 
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100 

GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACGA 11220 

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271 
(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6261 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120 

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180 

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240 

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 300 

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360 

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420 

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600 

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660 

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720 

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900 

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960 

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020 

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080 

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140 






TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260 

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320 

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380 

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440 

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500 

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560 

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620 

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740 

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800 

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860 

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920 

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980 

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040 

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100 

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160 

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400 

GATCCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460 

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2940

CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGAGGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740

CAAATACCAA CGCAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4860

AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGCACGAAGA AATGACGATG 5280

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400

ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTGCAAGA CTTCCCCCTA TTAAAATATT 5760

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060

GTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240

GGTTGCTGAA GTATCACAGG G 6261


(2) INFORMATION FOR SEQ ID NO: 13: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60

ATACGAGaTT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540

AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 1140

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200

CCATTAAATA ACGTCCCAAT TT 1222



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60 




TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180 

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240 

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300 

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360 

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420 

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480 

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540 

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600 

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660 

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720 

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840 

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900 

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960 

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020 

T 1021 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3759 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60 

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120 

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180 

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360 

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420 

ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCtC tATCGTTCCG 540

TCTTnAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG [illegible] 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA [illegible] 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT [illegible] 1140

CACTTAAATT TACAGTACTT [illegible] AAAAATTTAT AGCATAGAGA TTATATPTCT 1200

CTTACATTTG TACATATTTC [illegible] TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC [illegible] TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA [illegible] AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC [illegible] CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220



tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340 

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400 

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460 

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAATCA 2520 

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580 

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640 

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700 

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760 

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820 

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880 

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000 

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060 

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120 

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180 

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240 

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360 

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420 

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660 

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720 

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:








TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTAGCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680
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TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800 

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860 

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980 

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040 

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160 

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220 

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460 

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520 

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640 

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700 

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820 

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880 

35 

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940 

CTAATATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000 

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3 060 

40 

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120 

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180 

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300 

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360 

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480
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AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280

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ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCGTAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080

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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200 

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260 

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380 

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440 

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500 

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560 

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620 

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680 

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740 

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860 

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920 

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040 

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100 

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220 

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280 

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGXATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400 

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460 

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520 

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580 

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700 

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760 

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880 
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TCGAAACGTT ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000 

CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060 

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120 

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACG7GCTAA 9180 

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240 

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300 

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360 

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420 

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480 

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600 

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660 

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720 

AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780 

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840 

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900 

AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960 

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020 

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080 

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140 

ATAGfiAAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200 

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260 

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320 

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440 

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500 

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560 

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620 

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680 
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ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800 
TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860 
TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920 
ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980
ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040 
AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100 
TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160 

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220 

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280 

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTAAAAG 11400 

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460 

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520 

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580 

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640 

TTATGTAGTA GATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700 

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760 

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820 

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880 

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940 

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000 

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060 

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120 

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180 

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300 

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360 

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420 

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480 
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GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600 

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660 

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720 

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840 

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900 

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 12960 

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020 

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080 

AGACGT 13086 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60 

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120 

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240 

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300 

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360 

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420 

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540 

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600 

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660 

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720 

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780 
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ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900 

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960 

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020 

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080 

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140 

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200 

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260 

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350 
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC AGCCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840

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AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960 

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020 

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140 

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200 

ATGCCA7ACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260 

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320 

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
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TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360

CTAATAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900

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TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700

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AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820 

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880 

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000 

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060 

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120 

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180 

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300 

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360 

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480 

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540 

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660 

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720 

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780 

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960 

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020 

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080 

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200 

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320 

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500 
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AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620 

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680 

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740 

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800 

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860 

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040 

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100 

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160 

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220 

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280 

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340 

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460 

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520 

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580 

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640 

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700 

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760 

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820 

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880 

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000 

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060 

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120 

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180 

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300 




EP 0 786 519 A2 



CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420 

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540 

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600 

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660 

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720 

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780 

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900 

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960 

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080 

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 7140

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260 

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320 

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 60 

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360 
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ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600 

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660 

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720 

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780 

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900 

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960 

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 1020 

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200 

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320 

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380 

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440 

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560 

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620 

CAGlTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680 

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740 

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800 

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860 

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920 

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980 

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040 

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160 
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GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280 

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340 

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 2460 

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520 

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580 

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640 

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760 

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940 

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000 

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120 

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180 

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240 

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300 

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360 

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420 

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540 

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600 

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660 

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780 

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960 
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TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080 

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260 

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320 

10 

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380 

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440 

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500 

15 

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560 

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740 

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100 

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160 

35 

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220 

CGTTJTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280 

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340

40 

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400 

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460 

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520 

45 

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 5580

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760 
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CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880 

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940 

5 

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000 

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060 

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120 

10 

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240 

15 TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAG 6300 

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360 

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420 

20 GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480 

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540 

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600 

25 GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660 

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720 

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780 

30 

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960 

35 

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020 

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080 

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140

40 

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200 

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260 

45 TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320 

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560 

EP 0 786 519 A2 



TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680 

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740 

5 AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800 

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860 

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920

10 

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980 

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040 

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC 8100

15 

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160 

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220 

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280 

20 

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340 

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400 

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520 

TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580 

30 GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640 

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700 

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760 

35 TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820 

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880 

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940

40 

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000 

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060 

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120 

45 

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180 

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240 

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 9300 

50 

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360 

55 
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GAATTTATCA 


TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600

ATTATCAACA ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300

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TCAAATTGTA AGAACTAATC CTATTGCAGG 
TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAGCTAATCG 1560

ATATfiCTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100
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10 



AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220 

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280 

GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 2340

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 2460 

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520 

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580 

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 2640 

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2820

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880 

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2940 

25 TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000 

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3060 

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

30 TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180 

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3240 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300 

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3360 

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3480

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 





CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 120

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT [illegible] 360

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT [illegible] [illegible] 420

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT [illegible] 480

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 840

TTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960

TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCTTATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 1560

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680
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JO 



GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 2400

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820

TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3240

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480



TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600 

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660 

5 

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720 

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780 

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

10 

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900 

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960 

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140 

20 TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200 

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260 

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320 

25 

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440 

30 ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500 

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560 
TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

35 TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680 

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800

40 

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860 

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4920

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980 

45 

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 5040 

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100 

50 CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160 

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220 

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280 

55 
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GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 54 00 

ATATGGATAA TTCCTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 5460 

5 AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520 

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5580 

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640 

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700 

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760 

15 GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820 

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880 

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940 

20 AAAGTGTTGG TTTTTTCTTA GTAGAC 5966 
(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TTTGAACGTA TTTCATCAAA 60 

GCTATAATAC AAAATAGACC 120 

TTTTTGCTAT TACCTAACTT 180 

GCATATTTAA CTGTAGATGC 24 0 

AATaGCaAAC AArGCGTAAT 300 

AAATGATTTG CCTTTAATAA 360 

GTGATAAAGT CAGCTATAGA 420 

ACCGCTAAAC CGATAAGTAA 4 80 

AACGTCTTCC ACTTCTTTAA 540 

AGATTGATTA CTCATTTTGA 600 

AATAGTTTCG GGTTGGTTGT 660 

ACCGCCAACA TGACTGGCTA 720 



CTGTGTCATC GCGAAATAGT 
ATATAACAAT TTCATTAGTA 
TATAGTCACA CTGCTTATAA 
AAAGfiTGATC ATCCCTAAAT 
AAGAACTTCC TTAACCGTAA 
AATCATACGA TATGTATACA 
ATGGTTAGCG AAAAACAGTA 
AACATTCACA CCGGCAATAA 
CAATGTTAGT AATTTACTAT 
TCATTTTCTC CTCAGTAAAA 
TGTAATCACT GTCTATTAAA 
TCATCATACA TATACCATTA 



TAGGGTCATT CATTAATCCT 
AAGGGGACTT GTTCAAACCA 
TATAAGAGGT AACGATCACT 
AGAAATAAAT GACTACAAAT 
TAAATATCAA ATCATCAAAA 
AAATAATGAm AAACTGTmAA 
AATAAACTAA TATTAGTAAT 
CCGAAGATTG CTGAATAAAA 
TGTGTTGATT TTCCATTATA 
CATTCTAAAT AACGTTTTCT 
TATTTTTCCA GGACTTTAGC 
TCAGCTACTA ATTCTGAAAT 
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TTATTAAAAT AAACGTATCG TATTGTGATA 
ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520
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AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640 

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700 

5 

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760 

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820 

AGCGTACAGA TTGAACZAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880

10 

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940 

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000 

15 TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060 

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

20 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240 

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAAGATAA TCAAATTGAT 3300 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360 

25 

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420 

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 3480 

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600 

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660 

35 TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720 

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780 

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840 

40 

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900 

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020 

45 

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4080 

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140 

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320

55 
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GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4 440 

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500 

5 

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560 

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040 

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100 

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160 

25 

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 5340

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTAGTATAT 5400 

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460 

35 CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520 

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580 

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640 

40 

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700 

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820 

45 

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880 

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060 

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120 

55 
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TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240 

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300 

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360 

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420 

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480 

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600 

15 TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660 

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720 

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780 

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900 

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960 

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020 

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080 

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140 

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200 

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260 

35 TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320 

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380 

GCATAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440 

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500 

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560 

AGCGCAgcTT cAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620 

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680 

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740 

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800 

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860 

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920 

55 
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TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040 

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100 

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160 

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220 

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280 

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340 

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400 

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520 

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 8580 

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700 

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760 

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820 

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880 

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940 

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000 

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060 

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120 

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180 

ACTTTTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240 

TGctCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 9300 

CCATTTTCAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420 

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480 

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660 

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720

TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 10140

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980

CGTT^TAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520

GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480

CACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140

AACl'Ti GCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320

CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500 

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800 

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920 

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040 

TTTTGACGAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460 

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520 

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580 

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640 

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880 

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940 

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120

TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500

AATCGAACCT GTTGAACCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620

TTCGGGTACA CGAGTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920

TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17040

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100 

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280

TACAGAATCT AACAATGAAT CGTGCACATG 17310

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540 

CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600 

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660 

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720 

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780 

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960 

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020

ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820

AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA cTTCGCCAAT

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC

ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC

TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT CGTCACCATA

GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040 

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100 

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220 

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280 

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400 

CTGATATTGC GTGATaAATT ACC 5423 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120 

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300 

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360 

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420 

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600

TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 720

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840 

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900 

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080 

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200 

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260 

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380 

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440 

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860 

ATAWVCGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920 

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220 

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340 

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400

TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200

AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4320

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380 

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440 

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560 

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620 

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680 

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280 

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340 

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460

AAAAtfTCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520 

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580 

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700 

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760 

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820 

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940 

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000

ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 6120

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4920 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60 

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120 

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300 

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360 

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420 

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480 

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540 

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600 

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660 

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720 

GCAATCGTCC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780 

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840 

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900 

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960 

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020 

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080 

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140

TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 1260 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320 

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380 

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG CGAAAAATAT 1560 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 1620 

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740 

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800 

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980 

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160 

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280 

GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400 

AGAGCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460 

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520 

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580 

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700 

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760 

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880 

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940 
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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3060

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACCAAGT TACCGGTCAT 3660

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740
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TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4 860 

CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920 


(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60 

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120 


TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180 

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240 

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300 


GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360 

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420 

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540 

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600 

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60 
AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120
GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180 
TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240 
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5 



AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60 


GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120 

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180 

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240 


AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300 

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360 

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540 
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70 






CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660 

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720 

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780 

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900 

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960 

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020 

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140 

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200 

CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260 

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320 

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380 

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440 

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620 

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680 

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800 

GGTMTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860 

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920 

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980 

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100 

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160 

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280 

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340 
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 2460

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960

AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140
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TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380

GCTAAAATTG AA 4392



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480



GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600 

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660 

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720 

AGCGTTTGA 729 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
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TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA 240

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 300

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 360

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 420

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 480

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 540

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 600

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT 660

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 720

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 780

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA 840

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 900

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 1020

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 1080

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 1140

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGCT 1200

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 1260

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 1320

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 1440

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 1500

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 1560

AAAATAAATA ATGCATAATC GATACGAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 1620

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT 1680

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 1740

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 1800
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AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920 

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980 

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100 

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160 


TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220 

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280 

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400 

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460 


AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520 

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580 

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760 

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880 

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060 

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120 


ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180 

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360 

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600 
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TTCTTCTTGC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATACCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400
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TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 5520 

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580 

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700 

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880 

15 AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940 

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 

20 

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 6240 

25 

ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360 

30 AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420 

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 64 80 

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 654 0 

35 AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600 

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720 

40 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 684 0 

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900 

45 

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960 

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020 

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200 

55 
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TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320 

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 73 80 

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440 

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500 

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560 

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620 

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680 

15 CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 774 0 

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800 

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860 

20 ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920 

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7980 

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040 

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100 

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160 

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 822 0 

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280 

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340 

35 GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 84 00 

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 8460 

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520 

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580 

ACCATTCGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 8640 

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700 

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760 

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820 

50 ATTGGCTCTA TAATGTCGCG GTTTAyaGTt GGATCTTCGC TCCAACTGCA TATATAGTnA 8880 

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940 

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000 

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GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120 

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180 

5 

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 9240

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 9300 

AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 9360 

10 

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 9420 

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 9480

TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 9540

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 9600 

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 9660 

20 CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720 

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 9780 

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 9840

25 

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900 

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9960 

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 10020 

30 

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 10080 

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 10140

35 CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200 

AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 10260 

CTACCTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 10320 


CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380 

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440 

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500 

45 

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10560 

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620 

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10680

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 10740 

TGCGTTTGGT ACTTTTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10800 
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CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AAGCGAAAGC ACCATAAGCA 10920

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG CTATCGCTTC ATCAAAAACT 10980

TCTCTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 11280

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 11460

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060

TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180

AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600
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10 






ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12720 

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780

TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12840

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900 

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960 

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140 

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320 

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380 

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440 

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500 

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560 

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680 

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740 

CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800 

CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856 

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10088 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60 

AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120 

AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180 
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ATAATTGGTT AATATATGAG TAATTAGAAA ATAGACAAAG GATGACGATT TATGTATATC 300

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 1440

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980
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TTGGATTCAT GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100 

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160 

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220 

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280 

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340 

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 2400 

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACGAGC 2460 

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520 

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580 

GTACGATTGC TATTTTTATA TCAATATTAT TATTTATTAT TCCAGCTAAA AATACTGAAA 2640 

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA AAGTGGTTTA GCAAAATGGT 2760 

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820 

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880 

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3000 

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060 

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120 

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300 

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420 

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480

CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540 

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600 

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660 

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3720 

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780 
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TTTTTAATAT CAGAAATGGA ATCTGTTCCA TTACCATATA ATGCAAAGTT AATATCTAAA 3900

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960 

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4020 

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4080 

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 4140 

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200 

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4260 

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TTACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT GATCGACAAT 4380 

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCG 4440 

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4500

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4560 

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 4620 

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 4680 

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 4740 

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4800

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4860

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT ATATAGTAAT 4920

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 5040 

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 5100 

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160 

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 5220 

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TTAAAATACT 5280 

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 5340 

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 5400 

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520 

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580 
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TTAAAGTCTA TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 6240

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6300

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 6360

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 6420

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 6480

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 6540

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 6960

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 7140

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380
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TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620 

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 7680 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7740 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800 

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7860 

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 7980 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8040 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 8280 

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 8340

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 8400

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 8460

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8520

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 8640

TGAAAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700 

CGTfTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760 

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8820 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAGACCTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060 

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 
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CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 9360

TTCTAAAATA t-AAAi i 1 1 AA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 9480

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 9540

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 9600

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 9660

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 9720

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 9840

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 9960

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080

TGAGTTGT 10088



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120 

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180 

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 240

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 300

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 360 

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 420

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480 

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TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600

GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 960

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 1260

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 1740

CGTATTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 2160

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280
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CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 2400

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 2460

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 2640

CGTTTGGTAA ATCTAAAGCA GGOGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 2700




TATTCTTTAT 

4 r\ a. x a a. a. n a. 


GATTGeAGrfl 




GCGCCCAAAC ACATCTAGAT 


GCTTTAGCTA 


2760 



AfiTTf5TCTY5f5 

AC X IVIUI w 


ini x X innltf 


RATYSAAAATft 


TACGTGAGAA ATTATTACAT 


GCTTCATCAC 


2820 




fTR A AR AAC3T 


An. X f\\J\*\Jf\X \- 


A T A fll\ TY2 AflT" 


CTGATGATGA 


AGTGACAAAA 


GAAGAAGAGG 


2880 




CAG A AH CTG A 


Ar3r , Ar , Aar'L a 


V» x x VJWvil. 1 0 


CAGAACAATC ATCTAAACAA 


TCT AATGAG C 


2940 


CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 3120

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3240

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420

TTAATCCAAA AAGCTCAGAG TACAATGCGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGG 3540

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 3720

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 3780

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3960

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC GATGTTAATT TTTAGACGTA 4020

AATTTACAAA AGAACAACGT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4080
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TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCAGAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880
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TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000 

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060 

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120 

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180 

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240 

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300 

AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360 

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 6420 

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480 

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540 

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660 

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780 

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900 

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960 

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080 

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140 

TGATBATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200 

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320 

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380 

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440 

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34: 
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(A) LENGTH: 3492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60

sATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 480

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 540

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 780

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 840

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 1200

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 1440

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500
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TAAATGAATC CATCGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620 
TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 1980 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280 

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 2340 

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460 

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580 

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2640

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700 

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760 

AACAffACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2820 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940 

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060 

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120 

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3240 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT TTTAACGCTT 3300 
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CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60 

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180 

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240 

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360 

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420 

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480 

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540 

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600 

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660 

CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720 

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840 

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900 

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020 

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080 

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140 

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200 
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GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320 

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380 

CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440 

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500 

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560 

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620 

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680 

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740

TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800 

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860 

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7620 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60
TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120
AAAT5AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240
GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300 
GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360 

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420
AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540 

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600 
TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660 

GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 780 

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840 

CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900 

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960 

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 1020 

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 1080 

CATCGTGGAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 1140 

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 1260 

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 1320 

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 1440 

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500 

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 1560

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 1620 

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680 


TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 1740 

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800 

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 1860 


CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 1920 

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 1980 

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100 

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160 

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280 

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 2340

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400 

ACTGACAGCG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460 
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ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580
TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2640
CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2700
TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2760
AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820
CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880
AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940
TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000
ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3060
GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120
ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180
TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240
CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300
ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360
TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420
CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480
TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540
TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600
CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660
ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 3720
ATAOATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780
GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840
TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3900
GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960
GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020
CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080
GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140
CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200
TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260
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10 



ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4380 

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 4440

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4560

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4680

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4740

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4920

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5040 

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 5160 

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 5280 

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5340

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400 

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460 

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 5580 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820 

GATTCATATC GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5880 

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 5940

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060 
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5 






ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180
ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240
CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300
TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360
TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420
CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480
ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540
ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600
CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660
AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720
ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780
CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840
GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900
TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960
TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020
TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 7080
CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140
CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200
GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260
ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320
GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380
TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440
TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 7500
TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560
CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620


(2) INFORMATION FOR SEQ ID NO: 37: 







(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9834 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:





GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 60
AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120
ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180
TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 240
TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 300
CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 360
AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 420
CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAGCAATC ATGGACAAAA 480
TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 540
TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600
TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 660
AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 720
ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780
AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 840
TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900
AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 960
AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 1020
TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGCGATTG 1080
TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 1140
TCGTQTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 1200
TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 1260
CATTTAAATT TAATAGCAAT AATATCTTCT TCGGTAAAAT AATGGCGACA scgTGTTTCA 1320
GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 1380
TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440
CGGTCTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500
TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560
AATTAGAATC TTGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620
CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATCCATTT 1680




AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800 

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 1920 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 

CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 2040 

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 

ATGTAACAGG AATCTCATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 2220 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 2280

CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 2340 

CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 2400 

AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 2460 

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520 

TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2580

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 2640 

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 2700 

CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760 

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2820 

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2880 

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2940 

AAAGTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3000 

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3060 

AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180 

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3240

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3300

ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3360

CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 3420

CGATATATCA ATTAAAAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 3480
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AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600 

AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660 

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 3780 

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840 

AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3900

TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3960 

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4020 

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 4140

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 4260 

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 4320 

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 4440

CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4500

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4560

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4620

CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATGCTAAACC 4680 


AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4740 

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4800 

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATGGATGAG 4860 


TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920 

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4980 

GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGT TTACGGACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100 

AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160 

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220

GATAGTTTTG ATGGTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CTATGATGAC 5280 
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GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 
TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 
AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 
AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 
GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 
CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 
AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 
CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 
CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 
AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 
TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 
CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACGTAAA AAATATGAAA 
AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 
TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 
TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 
AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 
CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 
GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 
GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 
TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 
ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 
ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 
AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 
ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 
CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 
GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 
GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 
CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 
GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 
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GCGCCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200 

AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260 

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320 

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380

ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440 

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7500 

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560 

CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620 

TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680 

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 7740 

CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800 

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920 

GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980

AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040 

TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100 

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160 

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCCAGCAA TGACTGCAAT 8220 

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGATGCA TGCTGTAAAT GAATAAATTC 8340

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400

TAAAGCGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 8460

TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520 

ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580 

TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700 

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760 

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880 
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TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA 9000
TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 9060
TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 9120
GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 9180
TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 9240
ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 9300
TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 9360
ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT 9420
CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 9480
GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 9540
TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC 9600
AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660
TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720
TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 9780
TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23439 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60
TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120
GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180
GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240
AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300
ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360
GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420
TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480




TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600 

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660 

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720 

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780 

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840 

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900 

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960 

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020 

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080 

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 1140 

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200 

TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260 

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440 

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500 

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620 

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680 

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740 

TAACAGCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800 

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860 

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920 

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980 

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040 

GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100 

TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGG AGCATTTAGT TCGGCATTTA 2160 

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280 
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ATGCGACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400 

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT GCAGGTTATC 2460 

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520 

GTGCAGTAGA CGCATAGCAT CATCATATTG AATAGTAAAA ACAAATAAAA CATAGTAACG 2580

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 2640

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 2700 

TCAAGTACCT TTTACACGAT GGAATGAACT TACTTTTTAC GAAATTATGC GTATTTTATA 2760 

AACAAATATC ATTGATATAA CGGTAAATGT AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820 

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880 

TGAGCGATTC TGAGAAAGAA ATTTTAAAAA GAATTAAAGA TAATCCGTTT ATTTCACAAC 2940 

GTGAACTTGC TGAGGCAATT GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 3000 

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 3060 

TTTGTATTGG CGCAGCGAAT GTAGATCGTA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120 

AAACATCAAA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 3240 

AATGGGAAAT GATTAAACGA TTGTCCACAC CATTTATGAA TTTGGATCAT GTTCAACAAT 3300 

TTGAAAATGC GAGTACAGGT TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 3420

CACACTTATT GAAAAAGGCT AAGTGCATTA TTGTAGATTT GAATTTAGGC AAAGAGGCAT 3480

TAAACTTCTT ATGTGCCTAT ACCACGAAAC ATCAAATCAA ATTAGTTATC ACCACGGTTT 3540 

CTTCCCCAAA AATGAAAAAT ATGCCTGATT CATTACATGC TATTGATTGG ATTATCACGA 3600 

ATAAAGATGA AACAGAAACA TACTTAAATT TAAAAATAGA ATCTACTGAT GATTTAAAAA 3660 

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3720 

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCATCAAATA 3780 

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3840 

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3900 

TAGAAACGAA ATATACAGTT AGGCAAAACC TAGATCAACA GCAACTTTAT CACGATATGG 3960 

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080 
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GCCATTCCAG CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200
GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260
GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320
GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380
GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAACAAA TGTCACTGTT 4440
ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500
AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560
AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4620
ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATGAG 4680
CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAGCTGTTGT TGAAGCGGAA 4740
AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800
ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860
GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920
CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATCGTAA 4980
AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040
TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 5100
GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160
TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220
CATCTTTAAT TAXAXTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280
TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340
GCAACCAGAA GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCGAAATT 5400
ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460
GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520
TATCATCGCC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580
CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640
TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700
AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 5760
TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820
GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880
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CAAGGTATCA TTTCAGTTTA CTTAGTAAGc TTCGCTAATT TTGGTACGGT TGGTATCATC 6000 

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060 

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120 

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180 

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240 

CGTAGCCGTT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300 

TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360 

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660 

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720 

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780 

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840 

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900 

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT AACGATTTAT TTGTTTCTTT 6960 

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020 

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080 

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140 

ACCGtCTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200 

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320 

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380 
TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500 

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560 

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620 

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680 
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TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7800
TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860
AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920
CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980
CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8040
GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100
AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160
TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 8220
TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 8280
AATCATATTG TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340
AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 8400
ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 8460
CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520
TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 8580
AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640
AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700
AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760
ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820
CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880
TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 8940
GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000
GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060
AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120
GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180
ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 9240
TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300
TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360
ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 9420
ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 9480
AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 9600

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660 

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720 

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780 

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA CAGTGTGCCA CGGGCAAATT 9840 

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900 

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960 

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 10080

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC GCGATTGATC CGTTGTTAAA 10140

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260 

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320 

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440 

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500 

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560 

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620 

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680 

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740 

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 10800 

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10860 

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10920 

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10980 

TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 11100 

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 11160 

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280 
TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760

CAACGCCACG CGTAAAAACC AAATGATGAT GAGTTTCCAG ACAGGTATTT TAATTTCAGT 11820

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880

GACGACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGGTA GAAACATCTC 11940

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG GTGTATAGTT CGCTACAATC 12060

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540

GTTTTAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600

TAGCCCACTA ATAAAGAAAG CGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900

TGATCGACGA AAATAGTGTT GTTACCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080

ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13260

GTGTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13320

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13380

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13500

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13560

AAAAAACATC AACTAATGCA GCGGCACAAA AAGAAACACT AAATCAACCG GGAGAACAAG 13620

GGAATGCGAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACGAT ATGCATAAAG 13680

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13740

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13800

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13860

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACGTAA TGGATCACAA TCGACAACGT 13920

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980

ATCATGATAA AGCAGCACCA ACTTCAACTA CACCCCCGTC TAATGATAAA ACTGCACCTA 14040

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATCC AAATCAACAA GATACACATC 14100

AACCTGCGCA TCAAATCATA GATGCAAAGC AAGATGATAC TGTTCGCCAA AGTGAACAGA 14160

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14220

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 14280

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 14400

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATCGAAG 14460

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14520

ATGATCGCGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14580

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14640

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATAG TATGGGTGGT CAAACAATTC 14700

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14760

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14820

CAACATTAGC AACACCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATACAGAAG 14880

ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360 

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900 

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960 

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080 

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380 

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
CATCATTTTA ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16860

AGGTTCTGTG ACAAAAGGCG AAGACATGCC GACCATATCT GCATGTTGTA AAGCATCTAA 16920

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16980

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACGTCCAT 17100

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 17160

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280

ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 17340

TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520

TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700

TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760

TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 17880

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17940

AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18060

AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180

AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240

ACTTGCAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360

AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480

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1 AG 1 1 A\JGA 1 


GAATCGATTT 


CGTTATTACG 


TGATTATCTT 


ACAATAAAGG 


ATAAACCAAG 


18600 






I IAGGTGTCC 


tv iv t~*r* iv /■* jv t> 

AACCACACAT 


TGATCATTTT 


CCAGAAATGT 


GGTTATTAAG 


18660 


5 


inb 1 AGl~uV_A 


ACATCTGCCA


IV IV TV T*1V WTI^A 

AAATAGCTGC


CGAACTAGGT 


ATAGGGCTTT 


CTGTTGGAAC 


18720 




ATTTTTGCTA 


CCAGATATAA


ATGCGATACA 


TACAGCGAAG 


GATAACATTG 


ATATTTACAA 


18780 


10 


a r tarn tttp 


CAAGCATCAA 


CGATTAAAAT 


GGACGCAAAG 


GTGATGGCAT 


CTGTATTTGT 


18840 




GA 1 AACGAAG 


CGGAAGTAGC 


AGCATTACAA


CATGCCTTAG 


ATGTTTGGTT 


18900 




» • i * i ■tv r*r*T iv iv jv 
ATTAGGTAAA 


TTACAATTTG 


CAGAATTTGA 


AGATTTTCCT 


TCAGTAGACA 


CAGCACAAAA 


18960 


GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740

ACATGTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800

AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280



EP 0 786 519 A2 



CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460 

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580 

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640 

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 20820

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 20940 

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGGCATAT 21120 

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240 

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300 

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420 

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540 

TGGTSATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660 

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780 

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840 

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080 
AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 22200

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320 

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 22380

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440 

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500 

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 22560 

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 22680 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740 

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22860 

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23040 

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 23100 

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160 

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 23220 

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 23340 

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 23400

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60 
TATTATGCAG TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180

TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 240

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 300

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 360

CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900

TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 1320

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 1740

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860

AAACAAAAAT ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 1980

GAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAGAAATTA GACGTGAACG AGAACGTAAA 2160

GTGCTTCAAA AGCGTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220

GAAGAGCAAG ATGCAATACA ACGTGCAATT GATGAAATGT ATGCTAAACA AGcGGAACgC 2280

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 2340

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 2400

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 2460

GAGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2520

GTTTCGAATG TAACGTCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACG 2640

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700

ACAGAAGTAG ATGAGGATAC TACTTCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2880

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACGAATGATG TGAATGATAA TCATGTTGTG 2940

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTGCAA 3000

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3060

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 3240

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300

ACATCAAACA ATGTTGaGAA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3360

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3420

TTTGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCG 3480

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3540

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3600

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660

CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 3780

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3840

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3960

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4020

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4200

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 4260

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 4320

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380

CGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 4440

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4500

GATGATGGTC CGCAAGAAGT TG 4522



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



TCAAGTTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540

TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 660
AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720 

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

15 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660

gacgSaaccg TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 60 

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120 

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240 

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780 

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080 

CATGGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200 

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260 

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320 

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380 

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560




GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 1680 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800 

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860 

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920 

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980 

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160 

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220 

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340 

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580 

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700 

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760 

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880 

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930 
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3606 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60

TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 180

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 240

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGACAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 360

TCAATAGCTG GACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 540

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 600

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 720

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 840

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 900

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020






CGTTTATCAC 


TTCTATGCCT 


CAATCAATGT 


TACAACATAG 


TGGATGGAAT 


1080 


30 






ATTTGCTGAT 


TTAGCTATCT 


TATTAGGAAT 


TAATTGGCTC 


1140 






T & Ta r* & TTY2 A 


AGCTTTTGTA 


l LALCAI ICG 


/■< T* IV PTW /"V"*T 

GTACTGGCGT 


GTCATTTGTC 


1200 




GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGTJQ'GATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 1380

GCAACTTTAG TAGCCTATTT AACTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 1440

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 1860

CACACACATT AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160

GCAAGTTGGG GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATytCTTA 2220

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC TCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2880

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2940

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3240

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420

GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3540

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3600

TGGCTT 3606

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15109 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60 

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120 

TGGTTTGAAA ATGCAAC9AG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180 

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240 

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300 

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360 

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420 

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480 

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540 

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600 

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660 

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720 

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780 

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840 

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTCAGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960 

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020 

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140 

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200 

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320 

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440

GTACCACCAT TAACACACTT TATAACAAAT TCTACTTACT TTTTATGGGA TGTTTTAACG 1560

ACTGCTTATA TTGGTAACAA GGACTTGGTT CATTCAATTG AGAAAAAAGT CGATGTAATA 1620

AGTTATGGAC CAAGTCAAGG TAAGACATTT GAGTGTAAAG ATGGGCGCAA AATTAATGTC 1680

ATAAATCATG TAGATAACAA CGCATTTTTT GATTATATAA CTGCACTTGC TAAAAAAGTA 1740

AATTAACAGC TGTGTAGAAT AATTAAGGTT TTAATTTATA TAGAACAACT TATTGTAAAC 1800

TTTTCATTTC TTAAAGTTTA CAATGGTGCT ATAATAATGG TCATGAAATA CGAAAGGAAG 1860

TAAAAAATGA CAACAAAACA GTTAGTATAT ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920

TTAGGATTGG TACCGGTAAT TCCACTACCA TTTTCTTCAG TACCAATTGT ACTTCAAAAC 1980

ATTGGTATTT TCTTAGCAGG TGCGATTTTA GGACGTAAAT ATGGCACATT AAGTGTTATC 2040

GTCTTTTTAT TATTAGTAGT TGCTGGCTTG CCATTGTTAT CAGGTGGTCG CGGTGGCATC 2100

GGTGTATTCG CAGGTCCTTC AGCAGGGTTT TTACTATTAT ATCCAGTTGT AGCATTCATG 2160

ATTGGGGCGA TTCGAGATAG ATTCATCAAT GAAATTAATT TCTGGATTTT ATTCGTTGGT 2220

ATTTTAGTTT TTGGTGTTAT AGCATTAGAT GTTATTGGTA CATTGATTAT GGGCATGATT 2280

ATTAACATAC CATTTACGAA AGCTATTTCA ATTTCATTAG CTTATTTGCC TGGTGATATA 2340

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT ACAGCTTTAC TTAATCACTC GCAGTTTCGT 2400

CAAATTATGG GAATAAAATA ATCATATTTA AGATAGTAAA GTAATTGAAT AAGTTGCTTT 2460

GAAATTTATA AAAGTGAAAG GAGTAGGTGT CAATGGCTAG TATAAGTATG TCAGATATAT 2520

ATTGTAACGG CACTATATTT GAAAATGACG ACGAGCAGTT GATTTATTTA ACGCCTTCTT 2580

TTCCACAACG ATACACAAGT AACACATGGA TATATAAAAA GACGCCTACC CAAGAGCGAT 2640

GGCTGAAAGA CTTAGAACGT CAACATCAAT TACATACAAA TCAAGGTTCA AATCATTATG 2700

CGTTTAGTTT CCCGGAAAAT GAACAACTTG ATAATCATTG GATGGCTATG TTTAAAGATA 2760

TGAATTTTGA ACTAGGTATT ATGGAATTGT ATGCCATAGA AAGTGATGCG CTTGCCAATT 2820

TGCCGCGTAA CTCTGACGTT GAAATTGCCA TCGTTGACGA GTCGCATATA GATGCCTATT 2880

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT TTGGAAAAGA CTATGCAGAT GCACATGAAG 2940

AAATGGTAAG GGAACATTAT CAAAAAGATG TGATTAAACG CTTAGTAGCT TATTTAAATA 3000

ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

TTGGTGTATT AGAACAATTT CGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 3120

GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 3180

CAGCAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240



TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3360 

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 3420 

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 3480 

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3540 

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3600 

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3660 

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720 

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 3780 

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840 

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3900 

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020 

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080 

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140 

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200 

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260 

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4320 

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4380 

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 4440 

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4500 

TTAAAAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4560 

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4680 

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4740 

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980 

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040

CAAGGTTTTG TGTATAAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 5160

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT 5220

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520

AAGAGGACCA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG AATGGGCAAT TACTTACTGA 6180

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300

TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840

GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6960

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140


10 


ATAACTTTGA 


nn\» nv» \jn Litn 


era a tt a a *m 


A. 1 1 M 1 "1 ** I V lp» fTIJl 

AX 1 1 XuVJAXA 


lAnnlnlLul 


A1ALL1 ILtW\ 


7200 


TTTA'TYVVTYIrt 
X X IniWJiUU 




Ala 1 Al_A 1 A X A 


LAlxAAvjC 1 VjL- 


ATTTATAC CT 


^^*mw 7\ iv iv ^iv 

GGTAAAACAC 


7260 




f*TTf2 PTTT A A 




LLALAAl lVJV, 


f* A f^/" 1 IN IT* A IV TV 

C-ACjCATTAAA 


TTTAACATGT 


GATACAGTAG 


7320 


15 


uuulLnl XV_A 


ALU 1 oL-L-Aj 1 u 


A tAa A 1 UOCAA 


f'A A 1 M I ■ 1 *7t TV 

CAAGTTTACA 


% mm iv k ^ * m 

ATTAAGAGAT 


GCGATGAAAG 


7380 








TV A A A IV 7\ 


TAACTTATGG 


CGATATTTGG 


GAAGGTAGTC 


7440 




TATTHTTPI^TT 
nl XAX 1 L-Al 1 


X Iju 111 Lnu I 


AAAAloL-AAL 


GTTCZAGACTG 


TACAACTTGT 


GGAGATGTAC 


7500 


20 


Uviw 1 1 A 1 


PTR TTT'TA TV n r 1 
U1A1 1 1AAAL 


AAGAATGAAC 


AACGTTATG C 


AACATTGTGT 


GGTAGAGACA 


7560 




u i\j i aua(j i a 




T/" 1 A A mm jv 

TCAATTACAC 


ACGACATTCT 


TGTTCAATTT 


TTAAAACAAC 


7620 




All-Ala 1 


TT A T 7A P T 


AATTCGTATA 


TGGTTATGTT 


TGAATTTAAA 


GGACACCGCA 


7680 


25 


mm^ ■ i - ■ v - r — i " i i ■ 
X XVJX.Xl7l- XXX 


T A & Af^T^/IA 


Awl ill 1AA 


1 AuATLtU cat 


UACACGCACA 


TCAGATGCCA 


7740 




UIUa X If X An X 


Unnl X 1a1 lu 


T v T ,r Pr2f2 7A T A. A & 
X 1 llJlzAlAAA 


AAAAlxAl AAl» 


ACAAAAGGAG 


TGTaATATTA 


7800 




TGGGCGAACA 


x Uvinn^u x 1 


A A * TT(7 A A t r* 
nnnl 1 unn X L. 


C2T A r 1 A P TT A A 
ulnLnul 1AA 


Atj LlAvj L CGT A 


CTAACGGTAT 


7860 


30 


CAGATACTAG 


AGACTTTGAT 

nVJn\» XXX On 1 


nLnun X nnnu 


fSTYIfSTr'A ATC2 
0 X Lnnlu 


OfSTrTTT'PTi & 
Lu X uLuLtM 


L1A11 AL-AALj 


7920 




CAGATGACGT 


TGAAGTGAGT 


GACGCACATT 


ATA C* A ATTRT 
nxni*nnx lui 


HAAA^ATYZAA 
unnnun X V? AA 


a A AfJT arrr a 

AAALa X AOL. LA 


7980 


35 


TCACGACGCA 


GGTGAAGAAG 


TGGTTAGAAG 


AAGATATTGA 


TGTCATCATT 

X V7 X Wtl wl X X 


/\LV7AV» X VJO X VJ 


8040 


GAACAGGTAT 


TGCACAACGT 


GATGTGACGA 


TTGAAGCAGT 


AAAAPPArTT 

**r\*v\\~ <~r\\~ X X 


TTAAr*TAAAfl 
X XAALXAnAu 


8100 




AGAlAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGAxTC 8220

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640

AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880

GAAACCCAAT CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA AGCGGTGCGT ATTATGACTG 9180

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCGT AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360

TCGCGGTCCT TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900

AGG^AACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACTGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440

GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620 

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680 

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740 

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800 

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920 

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980 

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040 

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160 

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220 

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280 

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340 

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400 

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460 

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520 

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580 

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640 

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700 

TGGACGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760 

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820 

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880 

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940 

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000 
GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120 

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180 

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240 








ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12360

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATATAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040

GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220 

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 14280 

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340 

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460 

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520 

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580 

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640 

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760 

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT 14820

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000 

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060 

TAATCGTTTA GGTCCrATTT sATTTACAAA TTTACCTGTA GCAAATCGA 15109 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9072 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60 

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120 

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240 

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300 

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360

AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 480

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540 

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600 

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660 

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720 

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780 

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840 

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900 

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960 

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020 

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080 

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140 

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200 

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260 

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320 

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380

GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560 

35 CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620 

GCAQAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 16 80 

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740 

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 18 00 

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860 

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920 

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980 

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040 

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100 

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160 
ATTTTAAAAA GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340

GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460 

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520 

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580 

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640 

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700 

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760 

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820 

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300 

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360 

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420 

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480 

ACGfTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540 

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600 

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660 

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960

AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140 

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200 

7TTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260 

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320 

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380 

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440 

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500 

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560 

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4620 

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680 

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740 

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100 

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160 

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220 

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280 

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340 

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400 

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520 

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580 

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700 

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760 
AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060 

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120 

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACGCTAATAT 6600

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960

GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560 
CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 7680

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740

TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280

GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400

GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700

GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760

TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060

ATACAGCAAT CC 9072


(2) INFORMATION FOR SEQ ID NO: 46: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300

TGATCCAAAC TTTAAATACA AAACACOCAT CAATAACATT ACAAGCGAAT TATATGACGC 360

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140

AAG^CAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680

AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860 

AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920 

ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980 

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040 

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100 

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160 

AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220 

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280 

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520

TGACGAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT ATTTTTATGA 2580

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640

AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAA AAAGTTCGCT GAACAAGCAT 2700

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820 

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880 

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940 

ATACATCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000 

ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060 

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120 

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180 

AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240 

GTTAACAGTA GTTGGCTTAG GTTATATTGG TTTACCAACA TCAATTATGT TTGCAAAACA 3300 

TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360 

TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420 

AAAATTGAAG GTATCTACAA CGCCAGATGC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480 
TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3600

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660 

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720 

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780 

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840 

GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900 

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960 

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 4020 

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 4080 

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATCA TCAAAGTGTT 4140

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200 

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 4260 

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320 

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500

TGTGTCAAAC TAGGGCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 4680

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740

ATCAiTGATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4800

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100 

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160 

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280

GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 5400

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460 

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520 

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580 

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640 

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760 

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820 

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880 

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940 

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000 

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060 

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120 

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420 

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480 

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540 

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600 

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660 

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720 

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780 

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6840

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900 

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960 

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020 

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA GAAGAGCAAA AGATAAAGAT 7080 
TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 7200

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620

TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100

ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880 
GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

CGACATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680

TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480

GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTXATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800 

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860 

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920 

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980 

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040 

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100 

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160 

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220 

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280 
TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460

TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520

ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540

CAAdATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080

TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800

TGAnAAATnT CATTCATGTG GnAATC 16826

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4012 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120 

TAT^AACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180 

AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240 

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300 

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360 

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420 

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480 

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540 

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600 

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660 
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ATCAGGATGA TGATGAAATT AAGCCACCAT 
TGCGTACTAA AAAATTGCAA AAATATTTTC 

TGAAAAGTAA AAACCGATCA CAAACAGTAT 

TTGTAGAAGA GAATGACCAT TACACAGATT 

GAATTGAAGA TGGTAAAGTT TCAAAATATC 

CTTCACCATA TTCAATTTTT ATCAGAGGTG 

TATACGTAAG TGCTATGAGC GAGAATGCCC 

ATCGTTAATA TATTATTTAA TCGTGATGAC 

AATGTGAAAA AGATAAGTAT AACCCGTAAA 

AATGTCATAA TGATTGCAAC GATGTTCATA 

ATATCAAACA CCTCATTGTT AGATTATTGA 

TTAATGTGGT TGCTTGAGGA AAAATTTATT 

TGAATATCGT GTTAGATGAT GAAAGTATAT 

AATTGTACGA TAACATTAAA TTTAACACGA 

ATGGGTAAAT TTGAACTTGC TAAACTATTA 

TTCAAATCTT ACACAAGCTC TGAATCGACA 

ATTGTTAAAT AGAAGGAGAT ATCATAAATC 

CATGACGCAT TTGTTAAATC CCACCCAAAT 

GAAACAAAGA AATTAACTGG ATGGTACGCG 

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA 

TATATTTCGC GTGGTTTTGT TGTTGATTAT 

GACAGTGCAA AAGAAATTGC TAAAGCTGAG 

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG 

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC 

CCAATTGATA AAAATGATGA TGAGTTATTA 

GTGCGCTTGG CTTTAAAGCG AGGTACGACA 

ACATTTGCTG AGTTAATGAA AATCACTGGG 

AGTTACTTTG AAAATATTTA TGATGCGTTG 

GTAAAGTTGG ATCCAAAAGA AAATATAGCG 



TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780 

AAAAACAATT TTCAATTGAA ACTGTTATTG 840 

CGAATTGGTT GAAATGGTTT GATATGGACA 900 

TGATTTTAAA AAATGATGAT ATTTATTTTA 960 

ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020 

CTATTTATCG CTTTGAACCA TTAGTATAAA 1080 

ATATGAATAA TGACAAGCAC AATGGAAAGA 1140 

TTAATTAAAA TGAAAAAGAT TGATAATATA 1200 

CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260 

ATTATAAATA GACTTAAAAT AATTGTTCTC 1320 

CATTATAACA GGGGTAATTG TATATGAACA 1380 

CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440 

TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500 

AACATAGATA TAAAATGATT CACAATTAAA 1560 

ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620 

CTATAAGATA CAAACTGTAT AATTAAAGGT 1680 

ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740 

GGAGATTTAT TACAATTAAC GAAATGGGCA 1800 

CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860 

AAAGTACCTA AATTACCTTA TACGCTATGT 1920 

AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980 

AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040 

CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100 

TACATCCAAC CACGTATGAC TATGATTACA 2160 

AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220 

GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280 

GAACGCGATG GCTTCTTAAC GCGTGATATT 2340 

CATGAAGATG GAGATGCTGA ACTATTTTTA 2400 

AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460 
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CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 2580 

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640 

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700 

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760 

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820 

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880 

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940 

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000 

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060 

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120 

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180 

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240 

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300 

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360 

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420 

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480 

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540 

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600 

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660 

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720 

CAACSaGAAC AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780 

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840 

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960 

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 
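
Each row of the listing ends with the running count of bases up to and including that row, so a row can be checked by counting its bases and adding the count printed on the row before it. The following is a minimal sketch of that check, not part of the original listing; the function name `check_row` is hypothetical, and it assumes a row consists of whitespace-separated base groups followed by one integer.

```python
def check_row(row: str, previous_end: int) -> bool:
    """Return True if the bases on a listing row add up to its printed position.

    Assumes the row is whitespace-separated base groups followed by one
    integer (the cumulative base count), as in the rows of this listing.
    """
    *groups, position = row.split()
    bases = "".join(groups)
    return previous_end + len(bases) == int(position)


# The last two rows of SEQ ID NO: 47; the row before them ends at 3900.
row_3960 = ("TTATCAGGCA TCACCACAGC AAATTCAAAG "
            "CGTCAGCnCC AATATATAGT GGATGCCGCA 3960")
row_4012 = ("nGAACTAGGG GTTTAGCGnT ATCAGTGGTG "
            "TGGCGACTCT ATTCTAAACG AT 4012")
print(check_row(row_3960, 3900), check_row(row_4012, 3960))
```

A row that fails this check has most likely lost or gained characters in transcription, or carries a misread position number.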


(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60 

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120 

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180 

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240 

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300 

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360 

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420 

GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480 

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGtG 540 

TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 600 

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660 

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720 

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780 

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840 

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900 

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960 

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020 

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080 

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140 

tttaSaaaat AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200 

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260 

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320 

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380 

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440 

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500 

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560 

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620 

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680 
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CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800 

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860 

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920 

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980 

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040 

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100 

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160 

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220 

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280 

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340 

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400 

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460 

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520 

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580 

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640 

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700 

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760 

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820 

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880 

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA AtCTAGTTAG ACAGCACTTT 2940 

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000 

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060 

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120 

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180 

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240 

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300 

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360 

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420 

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480 
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GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3 600 

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660 

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720 

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780 

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840 

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900 

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960 

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020 

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080 

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140 

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200 

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260 

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320 

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380 

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440 

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500 

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560 

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620 

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680 

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740 

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800 

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860 

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920 

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980 

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040 

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100 

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160 

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220 

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280 
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15 






TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 5400 

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460 

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520 

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580 

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640 

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700 

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760 

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820 

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880 

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940 

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000 

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060 

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120 

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180 

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240 

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300 

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360 

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420 

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480 

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540 

TATAGGTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600 

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660 

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720 

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780 

AACAAAACTA ATTGAAGATG CAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840 

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900 

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960 

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020 

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080 
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TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200 

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260 

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320 

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380 

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440 

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500 

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560 

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620 

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680 

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740 

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778 
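
Several rows of the listing contain symbols other than A, C, G and T, such as `n`, `m`, `y` and `S`. The following is a minimal sketch for locating such positions, assuming these symbols follow the IUPAC nucleotide ambiguity codes; the listing itself does not state this, and the function name `ambiguous_positions` is hypothetical.

```python
# IUPAC nucleotide ambiguity codes (assumed to apply to this listing).
IUPAC = {
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def ambiguous_positions(bases: str, offset: int = 0):
    """List (position, symbol, possible bases) for every non-ACGT symbol.

    `bases` is the base text of one or more rows without the position
    number; spaces are ignored. `offset` is the count printed on the
    preceding row, so that reported positions are absolute (1-based).
    Symbols that are not IUPAC codes are reported with "?".
    """
    found = []
    for i, symbol in enumerate(bases.replace(" ", ""), start=1):
        upper = symbol.upper()
        if upper not in "ACGT":
            found.append((offset + i, symbol, IUPAC.get(upper, "?")))
    return found


# Last row of SEQ ID NO: 48; the preceding row ends at 7740.
print(ambiguous_positions(
    "CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA", offset=7740))
```

Characters that are not IUPAC codes at all, such as a digit inside a base group, are reported with `?` and are more likely transcription damage than ambiguity codes.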

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60 

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120 

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180 

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240 

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300 

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360 

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420 

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480 

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540 

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600 

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660 

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720 
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AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 840 

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900 

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960 

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020 

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080 

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60 

GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120 

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180 

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240 

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300 

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360 

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420 

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480 

ATTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540 

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600 

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660 

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720 

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780 

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840 

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900 

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960 

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020 
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TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 114 0 

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC 1200 

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGGGAGA GGTACATACT 1260 

TTAAGCATTA TCGGACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320 

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380 

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 1440 

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500 

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560 

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620 

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680 

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740 

CGATCTTGTA ACACGTTGTG TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860 

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920 

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980 

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040 

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100 

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160 

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGT 2220 

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340 

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400 

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460 

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520 

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580 

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640 

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700 

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760 

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820 

TGTTGTTTTA AATCAGCATT AAGCATGGTT 

TCTAAACCAG GTTGCAATGT TTTAGCGCCT 

GGGTTTTTAC GAACATATTG AGCTGCTTTG 

CCAATTCCGT CAGCGTTAAT TTCAGATGCT 

GTACCACGTT GAGCTAAACC AGTAGCTTCA 

TGTGATGGGT CACCAATCAT AGTGTAAGTG 

CATGCTTTGT GTACGAAGTG AGTATCAGTT 

TGTAATTCTT CATATTGGTT TTGTAAGTCT 

TCAGCAGGAT AGAAGCATAC TACGCTCCAA 

TTAAATTGAT CTTTTTTTGG ATCGAAArCT 

ATTAATGACA TAAATATCTT CCTCCTAAGA 

TTGCGCTTAA TTATAATAAT TCTAATCTCT 

TAGTCAACTG GATAACTTTG TAAAGTGAAT 

AAAGTGCTTT GATAATGGAT TTTGTAGTTG 

ATCTTGATTT TAATGTAAAA AATGTAAAAA 

CTTATTGATA ATGGTATGAG AATATTTGTT 

TGGATTTTTA AAGTATGAGA CAATATTTTA 

AATTGCTATA AAAAAGTTTG GACGTGTACA 

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA 

TTCTGTTAGA AAATTTAAGA ATAAACCTTT 

AGCTCGACAA AGCGCTTCGA CGTCAAGTTT 

CGATGAGAAG ATTAAAGAAA ATTTACGAGA 

TGGCTATTTA TTCGTCTTTG TTATTGATTA 

TGAAACTGAT ATGGAAAATG CATATGGTTC 

TGCAGCATTA GTTGCCGAAA ATATTGCGGT 

CTTTTTAGGA TCATTAAGAA ATGATGTTGA 

CTATGTCTTC CCGGTATTTG GTATGGCAGT 

AGGCAAGCCA CGCTTACCAT TTGACCATGT 

GGAAACACAG TATGCACAAA TGGCAGATTA 
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GTAATGCCTC 


CTTAGATTTT 


ACCTACTAAA 


2940 


TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000 

ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060 

TGTACAACAC CGTCTGGGTC GATAATGAAT 3120 

TCTAATACAT CAAAATTACG AGTGATTGTT 3180 

ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240 

GATACTGAGA ATACATTTAC GCCTAATTTT 3300 

TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360 

GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420 

TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480 

ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540 

TAGTTAGCAT TATTACATTT TGATCCAGAA 3600 

GATTACTTTT AAAATAAAGA AAGATAATAT 3660 

ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720 

AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780 

CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840 

AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900 

ATTGCAATAT GAAGATTTTA AATTAATTGT 3960 

ACATGTATAT AATCTTGTGA AAAAGCATCA 4020 

AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080 

CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140 

AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200 

TTATCGTCAT CATTTAGTTG ATCAACATGC 4260 

AACGGAAGGT TTGCTAGTAG GTGCAATCGA 4320 

AACTGCTGAA GATATGGGGT ATGGCATTGT 4380 

ACGCGTTCGA GAAATTTTAG ACTTACCTGA 4440 

AGGGGAACCC GCAGATGACG AAAATGGTGC 4500 

CTTCCATCAT AATAAGTATC ATGCTGATAA 4560 

CGACCAGACA ATCAGCGAGT ACTATGATCA 4620 
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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4740 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800 

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860 

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920 

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980 

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040 

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100 

TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160 

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5220 

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 5280 

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340 

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 5400 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460 

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520 

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580 

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640 

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700 

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760 

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820 

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880 

ATGA6ACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940 

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000 

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060 

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120 

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240 

GACTAGTATG CT 6252 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6730 base pairs 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60 

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120 

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300 

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 420 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 480 

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGtAGCTA CGCAgTATTT ACATTATTTA 780 

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 840 

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACTTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260 

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 1320 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380 

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440 

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560 
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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1680 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 1740 

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800 

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860 

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920 

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980 

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 2040 

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100 

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160 

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2220 

GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 2280 

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2340 

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 2400 

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 2460 

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2520 

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580 

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 2640 

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2700 

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760 

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2820 

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880 

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2940 

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000 

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3060 

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120 

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180 

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240 

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3300 

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3360 

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GGAGCGCCAC 


CTGATTTATT 


ACACCCACCT 


AAAGGTGATG 


CATTTGCGAG ACGTAGcAAT 


3480 


ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 3540 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600 

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720 

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3840 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3900 

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4080 

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140 

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4200 

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260 

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320 

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380 

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAt GATGAAGCAA 4500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560 

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620 

atgCaatgac GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4680 

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800 

TGACTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4860 

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920 

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4980 

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040 

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160 
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10 






ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 5280 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340 

TAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400 

CTGAATGAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460 

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520 

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 5580 

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640 

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700 

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820 

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 5940 

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060 

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120 

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 6240 

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6300 

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360 

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420 

TAAACTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 6480

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540 

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600 

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660 

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720 

ACGAAAGCTT 6730 
(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6482 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 300

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080

CTGGSACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620
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aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 1740 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800 

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860 

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920 

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980 

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040 

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100 

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220 

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280 

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400 

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460 

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520 

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580 

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640 

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880 

TTTCCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360 

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420 
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TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3540 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600 

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660 

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720 

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780 

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900 

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080 

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200 

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260 

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320 

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380 

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440 

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500 

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560 

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680 

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4860 

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040 

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100 

TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160 

CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220 
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70 




CCAGGTGATA ATGATTTCTG CTTATGAATC TGAGCATCAT TATTAGCGGC AGTAAAATCA 5340 

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400 

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460 

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520 

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580 

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 5640 

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700 

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760 

TGACCAGTAG CAAAGCCGGC ACCAACGACA ACACCAACAA AGGCAAATGC CACAATAATG 5820 

GACTCTTTTA AGATACGCAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5880 

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTTGT TAAATTGGAG 5940

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6000 

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6060 

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120 

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180 

CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240 

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300 

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6360 
TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480 

AT 6482

(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16592 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60 
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 
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10 






AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 240 

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 300 

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360 

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420 

ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480 

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600 

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660 

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720 

TTAAAAGGTA ATCGACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 780 

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900 

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960 

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020 

TCAAACATCC AAGTTAGATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080 

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 1140

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200 

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260 

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320 

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380 

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440 

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500 

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 1560 

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620 

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680 

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740 

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800 

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860 

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920 
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TATATTATGA AATTATATTT TACAATGCCC 
AAAACTATTT TAATAATCAT TGAACAAATG 2040

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 2220

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3540

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3600

ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660

ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3720
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3840 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900 

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3960

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020 

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140 

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200 

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260 

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320 

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380 

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560 

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4620

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680 

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4740

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800 

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860 

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4980

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040 

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160 

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 5220 

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340 

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400 

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 

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AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5640 

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700 

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760 

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820 

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940 

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000 

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120 

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180 

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6300

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360 

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540 

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600 

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660 

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACATCACG 6720 

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780 

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840 

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6960 
TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140 

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200 

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260 

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 7320 
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CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 7440 

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7500 

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7560 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7620 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800 

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860 

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920 

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980 

CATACACTAC ACTAAATCAT TTCGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040 

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100 

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160 

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 8220

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8280 

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340 

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 8400

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580 

ACATTTGCGT CGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640 

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760 

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820 

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940 

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000 

CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060 

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120 






EP0 786 519 A2 






GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240 

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9300 

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360 

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420 

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480 

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 9600 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660 

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780 

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 9840 

TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960 

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020 

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080 

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 10200 

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 
GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380 

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620 

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680 

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860 

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920 
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AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040 

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100 

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160 

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220 

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280 

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340 

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400 

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520 

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580 

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700 

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760 

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820 

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880 

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940 

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000 

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060 

ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180 

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240 

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360 

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420 

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540 

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600 

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660 

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720 
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AATTCACGCA 
TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12840

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12960

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13020

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320

GTTAAACCTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13380

TGAACACCGC CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCTACCATAG CTGCACCAAT GGCAGTTCCA 13560

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCACCATA TTTTTGTTGC 13620

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAAGATA 13740

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13800

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC 13980

GTCAHTCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14040

TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14460

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14520
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AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640 

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700 

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760 

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820 

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880 

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940 

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GGCTATCAAT 15000 

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120 

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180 

AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300 

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360 

TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420 

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480 

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540 

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600 

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660 

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780 

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840 

GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960 

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020 

CGATTTAGTA GGAATACAAC CTTTATGGAG AGAAGTACCT CCTAATAGTT GTCGTTCTAC 16080

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140 

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200 

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260 

CATGATTGTC TATTTAGTT7 GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320 
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TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 16440 
AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16500 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560 
TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13794 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60 
TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120 

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180 

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240 

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300 

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360 

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420 

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540

CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600 

ACAOTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660 

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780 

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840 

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900 

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960 

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020 

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080 

TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140 
TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 1260 

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320 

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 1380 

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 1440 

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500 

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560 

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740 

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800 

TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 1860 

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040 

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220 

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400 

AAATCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 2460 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640 

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700 

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760 

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACCACCT 2820 

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880 

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940 
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TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060 

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120 

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180 

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240 

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300 

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3360 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420 

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480 

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3540 

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600 

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660 

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720 

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780 

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3840 

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3900 

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3960 

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020 

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080 

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140 

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200 

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260 

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320 

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380 

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440 

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500 

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560 

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680 

ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4740 
TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860 

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4920 

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980 

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 5040 

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100 

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160 

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220 

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280 

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340 

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 5400 

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460 

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520 

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580 

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640 

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700 

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760 

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820 

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880 

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940 

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000 

GTGT£TTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060 

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120 

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180 

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240 

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300 

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360 

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420 

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480 

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540 
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TGGTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6660 

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720 

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780 

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840 

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900 

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6960 

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020 

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080 

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140 

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200 

TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260 

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320 

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380 

ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440 

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500 

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560 

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620 

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680 

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7740 

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 7800 

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860 

CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920 

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980 

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040 

TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100 

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160 

GCTGCAGCTT CTTTGTCTTG TCCCATCCCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220 

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280 

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8340 
ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460 

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520 

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580 

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640 

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700 

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760 

GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820 

TTT^TTTT A P AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880 

n a a p a AAA T CACACTACCT GCACCTATCG PAAGTACAAC TAATGCAACA TTTACATCTG 8940 

/\ 1 \jr\ 1 lui nn TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000 

ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060 

CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120 

a TnT & ppr p c ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180 

PTY2ATTPPAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240 

ATAATAPTfiP TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300 

AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360 

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420 

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480 

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540 

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600 

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660 

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720 

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780 

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840 

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900 

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960 

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020 

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080 

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140 
GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260 

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320 

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380 

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440 

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500 

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560 

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620 

GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680 

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740 

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800 

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860 

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920 

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980 

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040 

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100 

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160 

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220 

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280 

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340 

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400 

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460 

TCTGCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520 

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580 

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640 

ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700 

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760 

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820 

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880 

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940 
AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060 

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120 

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180 

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240 

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300 

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360 

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420 

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480 

CTTCATTACA CGACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540 

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12600 

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660 

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720 

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780 

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840 

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAACGGCCAT TTATTTTTCA 12900 

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960 

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020 

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080 

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140 

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200 

ACGC&CAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260 

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320 

AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTGTCCA GTCGAATAGT 13380 

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440 

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13500 

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560 

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620 

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680 

TAAACTTGcC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740 
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(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1059 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 





GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60 

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120 

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180 

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240 

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300 

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360 

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420 

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480 

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540 

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600 

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCf AGTTGAGGTG GATACTGTGC 660 

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720 

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780 

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840 

gaatSttttt GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900 

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960 

TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020 

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30246 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60 

ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120 

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180 

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240 

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300 

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360 

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420 

CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480 

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540 

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600 

TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660 

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720 

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780 

AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840 

GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900 

GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960 

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020 

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080 

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 1140 

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200 

TAACGTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260 

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320 

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380 

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440 

GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500 

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 1560 

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620 

CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680 

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740 
CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860 

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920 

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980 

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040 

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100 

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160 

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220 

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280 

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340 

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400 

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460 

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520 

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580 

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640 

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700 

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760 

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820 

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880 

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940 

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000 

TGT^ATAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060 

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120 

GAGATACTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180 

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240 

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300 

ATAGTGCTCT TGGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360 

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420 

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480 

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540 
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TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660 

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720 

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780 

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840 

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900 

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960 

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020 

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080 

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140 

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200 

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260 

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320 

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380 

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440 

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500 

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4560 

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620 

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680 

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740 

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4800 

AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860 

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920 

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980 

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040 

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100 

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160 

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220 

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280 

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340 
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CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460 

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520 

GTTACGAATC CACCAATTGA TGCGTATGGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580 

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640 

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700 

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760 

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820 

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880 

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940 

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000 

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060 

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120 

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180 

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240 

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300 

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360 

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420 

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480 

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAAOGCTT TAATACAGGG 6540 

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600 

CAAUAGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660 

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720 

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780 

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840 

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900 

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960 

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020 

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080 

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140 
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AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 
TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 
GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 
AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 
CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 
AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 
ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 
GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 
AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 
GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 
GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 
GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 
GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 
GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 
ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 
GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 
GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 
GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 
AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 
CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 
GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 
AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 
TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 
AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 
ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 
CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 
CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 
CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 
TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 
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CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060 

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120 

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180 

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240 

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300 

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360 

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420 

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480 

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540 

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600 

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660 

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720 

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780 

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840 

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900 

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960 

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020 

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080 

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140 

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200 

AACGTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260 

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320 

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380 

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440 

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500 

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560 

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620 

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680 

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740

TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGCGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540

CGACGGAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720 

GTAATGCA7G GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780 

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840 

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900 

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960 

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020 

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080 

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140 

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200 

GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260 

GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320 

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380 

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440 

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500 

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560 

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620 

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680 

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740 

ATTTATTTGT TTATGAACGT CATTATAAGA ATCAACAATG GCTAGTAATT GCGAATTTCT 13800 

CAGCATCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860 

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980

ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040 

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100 

ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160 

ATCAGGAGGT TACAGAGTTT CCATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220 

AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280 

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340

TTGTTTCAGA TATAGGTAAT GATGTTGCGA GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460

TATTAAATCT TAATATTAGT TATTCAAGTA AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520 

AAGCATATCA ATTGTTTGGT GATGTATCGG TGGCTTATTC AGCAACAGTT CGAAGTATTG 14580 

TGTATTTAGA AAATACAATG CCGTTTCAAT ATAATATTTC AAAACATCTT GCAAATGAAT 14640 

TTAAATTTAA TGACTTCTCA AGACGTCGTA TAAAGTAAAC AATGATATAA ATGATTTATA 14700 

CTTGCAATTA ACTATTAAAA TATAGTAATA TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760 

CGGTTCCCTG TACTCGAAAT CCGCTTTATG CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820 

TTTTGCGAAG TCTGCCCAAA GCACGTAGTG TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880 

CCCATGAACC ATGTCAGGTC CTGACGGAAG CAGCATTAAG TGGATGATCA TATGTGCCGT 14940

AGGgTAGCCG AGATTTAGCT AACGACTTTG GTTACGTTCG TGAATTACGT TCGATGCTTA 15000 

GGTGCACGGT TTTTTATTTT TTAAATATTA AACCGATTAT TAAGAGTTGA AAATATATAA 15060 

TTATAGAAGC TACTTTCTTG AAGACAATTC AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120 

AAGTAGCTTT TTTATATGTG AAGTTTGATT CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180 

TTTTGTGTCA ATGAAAAGTA AGAAGTTATA ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240

AGGGGGAGTA TCTTACAATA GAATTATTAA TGAGATACGT TATGATTATT GACAATCAAA 15300 

TGCCTACGGA GGACATATGC AAATATATTT AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360 

ATCTTTAAAT AGTATTGAAG AAAGTTTTGA TGATAATCCT GAAACGAGTT GGCAAGCACG 15420 

TGCGAAAGTA AAACATTTAA GAAAATCTCC TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480 

GAAAAATGAA AATAACGATG TCGTTGGACA CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540 

TGATGATAAG ACGTATTATG GTTTGGCGAT TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600 

TGGACAAAAA TTAGGTCGTG GCTTGGTTCA AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660 

GTATAGTACG GTTGTTGTAG ACCATTGTTT TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720 

TGCTGCTGAG CATGACATTA AATTAGAATC TGGTGATGCA CCGTTACTTG TAAAATATTT 15780 

ATGGGATAAT TTGACGGATG CACCACACGG AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840 

ATTGTTCAAT TAAGAAGTAA AGGTATTATC ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900 

GGTGCTAACT TGAATTATCA AGCCTTATAT CGTATGTACA GACCCCAAAG TTTCGAGGAT 15960 

GTCGTCGGAC AAGAACATGT CACGAAGACA TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020 

TCGCATGCTT ATATTTTTAG TGGTCCGAGA GGTACGGGGA AAACGAGTAT TGCCAAAGTG 16080 

TTTGCTAAAG CAATCAACTG TCTAAATAGC ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140

AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16260

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16500

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAGTGTGAA TCAAAACGTT 16920

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16980

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940

TCCCTGGaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120 

ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180 

TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240 

ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300 

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360 

TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420 

ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480 

TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540 

TATCGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600 

GTAGAACAGA AATGTAATkT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660 

AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720 

GCGGTGTGGC GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780 

CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840

TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900 

CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960 

GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020 

TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080 

ATATGGTTAG AGAAAAATCT GTTTCTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140

TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200 

ACTGTTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260 

CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320 

AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380

AATGACAATA TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440 

GTATTTATGA GCTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500 

GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560 

CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620 

ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680 

AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740 
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GAGACACGGT CCAGACTCCT ACGGGAGGCA 
gCtGaCGGAG CAACGCCGCG TGAGTGATGA 

5 GGGAAGAACA TATGTGTAAG TAACTGTGCA 

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT 
■ TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT 

10 GTGGAGGGTC ATTGGAAACT GGAAAACTTG 

GTAGCGGTGA AATGCGCAGA GATATGGAGG 
TGTAACTGAC GCTGATGTGC GAAAgCGTGG 

15 

CCACGCOGTA AACGATGAGT GCTAAGTGTT 
AACGCATTAA GCACTCCGCC TGGGGAGTAC 
GGGGACCCGC ACAAGCGGTG GAGCATGTGG 

20 

CAAATCTTGA CATCCTTTGA CAACTCTAGA 
GACAGGTGGT GCATGGTTGT CGTCAGCTCG 
CGAGCGCAAC CCTTAAGCTT AGTTGCCATC 

25 

GTGACAAACC GGAGGAAGGT GGGGATGACG 
TACACACGTG CTACAATGGA CAATACAAAG 
CATAAAGTTG TTCTCAGTTC GGATTGTAGT 

30 

CTAGTAATCG TAGATCAGCA TGCTACGGTG 
CGTCACACCA CGAGAGTTTG TAACACCCGA 

55 CGTCGAAGGT GGGACAAATG ATTGGGGTGA 

GCGQCTGGAT CACCTCCTTT CTAAGGATAT 
ATAACGTGAC ATATTGTATT CAGTTTTGAA 

40 TAAAGTGATA TTGCTTATGA AAATAAAGCA 

TACATTGAAA ACTAGATAAG TAAGTAAAAT 
AAAGAGTTTT AAATAAGCTT GAATTCATAA 

45 CACAAGATTA ATAACGCGTT TAAATCTTTT 

TGACTTATAA AAAT GGTGGA AACATAGATT 
GGCACTAGAA GCCGATGAAG GACGTTACTA 

SO 

AGCTTTGATC CAGAGATTTC CGAATGGGGA 



GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860 
AGGTCTTCGG ATCGTAAAAC TCTGTTATTA . 19920 

CATCTTGACG GTACCTAATC AGAAAGCCAC 19980 

ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040 

AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100 

AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160 

AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220 

GGATCAAACA GGATTAGATA CCCTGGTAGT 20280 

AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340 

GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400 

TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460 

GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520 

TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580 

ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640 

TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700 

GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760 

CTGCAACTCG ACTACATGAA GCTGGAATCG 20820 

AATACGTTCC CGGGTCTTGT ACACACCGCC 20880 

AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940 

AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000 

ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060 

TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120 

GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180 

ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240 

GAAATAATCG CTAGTGTTCG AAAGAACACT 21300 

TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360 

AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420 

ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480 

AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540 
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GAGGAAGAGA AAGAAAATTC GATTCCCTTA 

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG 
5 TTAGACGAAT CATCTGGAAA GATGAATCAA 

TGTCTCTCTT GAGTGGATCC TGAGTACGAC 
- AGGACCATCT CCTAAGGCTA AATACTCTCT 
10 GAAAGGTGAA AAGCACCCCG GAAGGGGAGT 

GTAGTCAGAG CCCGTTAATG GGTGATGGCG 

GATTTGATGC AAGGTTAAGC AGTAAATGTG 

15 

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC 

CAGGTAACAC TGAATGGAGG ACCGAACCGA 

GGGTAGCGGA GAAATTCCAA TCGAACCTGG 

20 

GGGCTAGCCT CAAGTGATGA TTATTGGAGG 
CGGGTTACCG AATTCAGACA AACTCCGAAT 
TGGGTGATAA GGTCCGTGTT CGAAAGGGAA 

25 

ATATATGTTA AGTGGAAAAG GATGTGGCGT 
GCAGCCATCA TTTAAAGAGT GCGTAATAGC 
GTACCGGGGC TAAACATATT ACCGAAGCTG 

30 

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA 
CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA 

3S AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG 
TGGATAACAG GTTGATATTC CTGTACCACC 
tAGGATAGGC GAAgcGTGcG ATTGGATTGC 

40 AAATCCGGTA CTCGTTAAGG CTGAGCTGTG 
TTGATTTCAC ACTGCCGAGA AAAGCCTCTA 
GACACAGGTA GTCAAGATGA GAATTCTAAG 

45 GGCAAAATGA CCCCGTAACT TCGGGAGAAG 
GCCGCAGTGA ATAGGCCCAA GCGACTGTTT 
AGGTGATGTA TagGGcTGAC GCCTGCCCGG 

SO 

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA 
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GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660 

ACACTCTATA CGGAGTTACA AAGGACGACA 21720 

AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780 

GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840 

AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900 

GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960 

TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020 

GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080 

AGGTGATCTA CCCTTGGTCA GGTTGAAGTT 22140 

CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200 

AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260 

TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320 

GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380 

ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440 

TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500 

TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560 

TGGATTGTCC TTTGGaCAAT GG tAGGAGAG 22620 

GGACATGTGG AGCGCTTAGA AGTGAGAATG 226 80 

TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740 

GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800 

TATAATCGTT TTAATCGATG GGGGGACGCA 22860 

ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920 

ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980 

GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040 

GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100 

GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160 

ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220 

TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280 

ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340 
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TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460 

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23 520 

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 2 3 580 

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640 

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700 

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760 

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820 

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880 

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940 

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000 

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060 

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120 

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180 

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240 

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300 

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360 

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420 

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480 

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAAGGTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGAsGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020 

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080 

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140

TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TTTTATAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940

ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740

TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920 

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980 

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040 

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100 

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160 

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220 

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340 

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400 

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460 

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520 

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580 

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700 

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760 

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880 

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940 

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000 

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060 

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120 

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180 

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240 

TCAAGA 30246 
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTGA TTTACTGTAA AGATGTCATC 420

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTTC 480

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960

GACATGATTA CCATAAATAT CAACATTATT ATATGTGACG TCGAACTGTC TCGGTGCAAC 1020

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140

CATQTCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680

CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860 

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920 

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980 

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040 

ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100 

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280 

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340 

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460 

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520 

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580

TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640 

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700 

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760 

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820 

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880 

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940 

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000 

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGGAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120 

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180 

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300 

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360 

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420 

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480 
TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCCAAT ACCTTTTCGA TGGTTGCATA AATTATAGTC TTTTGATTTA 4260

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAG aTGtGACACG GTCACCAACT 4320

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440

CCTGCATAAT GTACTTTAAT CTTTACTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAACCTTC CATTAAGATT TTTACCATTC 5040

TAGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280
CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 5400

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120

CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 6180

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600

GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 6720

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080
TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7500

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920

TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATGtfrAAGTT GAGCGCCTTT aaaacttcac ttttagtaat ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760

CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060 

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120 

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180 

CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240 

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300 

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360 

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420 

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480 

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540 

AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600 

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660 

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720 

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780 

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840 

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900 

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960 

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020 

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080 

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140 

TCCTCGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200 

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260 

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320 

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380 

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440 

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500 

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560 

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620 

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680 
AAATTTACCG GTCATCTTTG CAATTGGTGT CGCAATCGGA TTATCTAGAA GCGATAAAGG 10800

TACTGCAGGT tTAGctGCGC TGCTCGGTTT CTTAATTATG AACGCAACTA TGAATGGCTT 10860

ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA CAAAATGGAC AAGGCATGGT 10920

GCTCGGTATA CAAACGGTTG AAACCGGTGT TTTTGGCGGG ATTATCACAG GTATTATGAC 10980

CGCAATACTT CACAACAAAT ATCACAAAGT GGTATTACCA CCGTATTTAG GTTTCTTTGG 11040

TGGCTCTAGA TTTGTCCCTA TTGTCACAGC ATTTGCCGCA ATCTTTTTAG GTGTATTGAT 11100




nTTTTTrATT TGGCCAAGCA TACAAGCCGG CATTTATCAT GTTGGTGGAT TTGTAACGAA 11160

AAPAGGTGPP ATCGGTACTT TTGTTTATGG CTTCATCTTA AGATTGTTAG GTCCACTCGG 11220

111 /\^nv.^n J. ATTTTTTACT TACCGTTTTG GCAGACGGCA CTTGGTGGTA CTTTAGAAGT 11280

P & & AP.f!^P A P TTAGTTCAAG GTACGCAGAA CATCTTCTTT GCTCAACTTG GTGATCCAGA 11340

TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400

TGTGGTGCCG CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460

GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTTTAACAG GTATTACCGA 11520

TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580

TH ATP^"J ATT A GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700


CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760

CTTAATTACG AAATTTAATT TCAAAACACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820

AGTTGAGGCT ACTGAAAGAG CACAAACTAT TGTTGCTGGT TTGGGAGGCA AAGATAACAT 11880

TGAAATCGTT GACTGTTGTG CAACGAGACT ACGCGTCACA CTTCATCAAA ATGACAAAGT 11940

CGATAAAGTA TTACTCGAAA GTACTGGTGC CAAAGGTGTA ATCCAGCAAG GCACTGGTGT 12000

GCAAGTAATT TATGGGCCTC ACGTTACAGT TATCAAAAAT GAAATTGAAG AATTGCTCGG 12060

GGATTAAGAC TAACCGAAAT ATCAACAGAA CTAATGGCAA CGATGTACGA AGTAAGAAGT 12120

GACATCGTTG CTTTTATTTT TAATGTTACA TTTGAAGCAT TAAGTTCATC ATGCACTGTA 12180

GTGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAGT 12240

GCCAGATTTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300

CACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360

TGGGATGGTA TAGATTTTAT AAAAAGTTAT TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420

TTTAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TGGGATTTCT 12480
TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660 

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780 

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840 

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900 

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960 

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080 

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140 

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260 

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320 

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440 

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500 

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560 

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980 

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100 

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160 

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220 

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280 
(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60 

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTAtC CaTCaATACA 180

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240 

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 300

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATG GTATCAGTGG 360 

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT CATACCGTGC 420 

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATACAATTT 480

CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540 

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600 

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720 

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780 

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840 

CAAG&GCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900 

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020 

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080 

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ATTTTTACTT AGTGGGTTGT 1140 

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200 

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260 

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320 

GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380 
ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 1500

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560 

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620 

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740 

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800 

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860 

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920 

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 2040

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100 

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 2160 

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220 

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAA1TA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2340

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520 

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580 

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700 

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760 

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820 

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880

AATGAAACGT GTAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2940 

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000 

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3060 

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180 
TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160

CCAGTGGGTC TTATCGTTGC AGCTATCACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 5220

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 5280

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 5400

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTmAC 5520

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580

AAAATGACAT TATTAACATT TATAACGATA CCGATATTCG TTTTaATTAT GATTCCTCTA 5640

GGTCGTATTA TGCAAAAGAT ATCGACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 5700

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 5760

GAATTAGATA ATGCACATAA AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 5820

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 5880

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 5940

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6060

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 6240

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 6300

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 6780
TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 6900

TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6960

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7020

ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 7080

AGTCATAATT CTTCATATTT TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200

GTTTTTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260


& ft ft ft TV T»ft TT^f AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 7320

*T*T T* ft. P P ft. TW TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 7380

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 7440

GACAAATGTT AAAATGATGG A I ATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500

GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 7620

GATGTTCTTA GGAAATACTG CGTATCAAGT GAGCAACTAA AACGTATGAA 7680

TGAAATGCGT GTATTAACAA TCCZ P & A TP. A T GTCAATGAGC TCTGTATCGG GAGCTATTGT 7740

AGGTGCGTAT GTACAAATGG T ap PRf^n a p. a ACTGGTACTA ACGGCAATTC CACTAAATAT 7800


CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7860

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CGATTCTTCT CATTCCTTGG 7920

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7980

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040

ATGGXTAGGC ataaaaggta GTTTCGGTTT AAACCAAATT TTAGGTGTGT TTATGTATCC 8100

ATTTGCGCTA TTACTCGGTT TACCTTATGA TGAAGCGTGG TTGGTAGCAC AACAAATGGC 8160

TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 8220

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 8280

AACGATTGGT ATGATTATCG GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 8340

TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 8400

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580
GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8700

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760 

ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:





GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC 540

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660

AGA^AATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC 720

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200
TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740


iAXn\jQrAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860

1 \j 1 1 LA I vjCST ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920

lalaaiavj-la WliWjTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AAGTAGTTAA 1980

rtLHiiiM^iA l i A IluTTUTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040

a i i/w\ivjwuj i\jLLftAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100

iiinnuiuun flMnftlArtnLH nl/Willjuv.1 1 vjAAoA I AAA AATGaTGTTG CATTAGTTTC 2160




AGGCGGGAAT GTTY^ArTTAA rTHfiaflTTTr ^^TTTrsTT r>nnriifrr<r<Ko 


TGAATATTGC 


2220 




AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT CATTTATTTA 2580

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAATTTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000
TGGTAATTTA AATGCAGCTT CAGATACATC AAAAATATTA TTTGGTGAAA ATGGCGGTAA 3120

GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240 

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300 

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420 

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480 

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540 

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600 

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660 

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720 

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780 

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840 

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960 

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020 

AAATATATTA GGTTCATTAT TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140 

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200 

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260 

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320 

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440 

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500 

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560 

TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620 

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680 

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATGTCT 4800 



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ATTCTTTGGT TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA



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GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6720

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACC&GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400

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TACAGGTAGA GTGAGTATGA ATCAGGCATT TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520 

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580 

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700 

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760 

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820 

GATGACATAC GAACCAGTTG TGAAACCTGA ATACCAAACT GTCAATGCTG CTAAAACAGC 8880 

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940 

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000 

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060 

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660 

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720 

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780 

TAACTCAACT GTTGTTAACG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840 

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900 

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020 

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080 

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200 

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GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTAGAATTG 11100

TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACGCG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460

gcgaCggttt TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000



EP 0 786 519 A2 






TGAATAAACC GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800

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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13920

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980 

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100 

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160 

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220 

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280 

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340 

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400 

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460 

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000 

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060 

AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120 

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180 

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240 

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300 

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360

AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420 

ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480 

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600

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ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAkCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860

TGAXACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400

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TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580 

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700 

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760 

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820 

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880 

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940 

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060 

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120 

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180 

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240 

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360 

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420 

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540 

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600 

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660 

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720 

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780 

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840 

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900 

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960 

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020 

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080 

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200

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TAACAATGCT GTTGATAGTG CTAATGGTGT CATTAATGCA ACAAGCAATC CAAATATGGA 19320

TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCGAAA ATGATAATAC 20340

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580

CCAAAAACAA AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000
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GAAAGGCTTA AGAGATAGTA TTGCGAATGA AGCAACAATT AAAGCAGGTC AAAACTACAC 21120

TGACGCAAGT CCAAATAATC GTAACGAGTA CGACAGTGCA GTTACTGCAG CAAAAGCAAT 21180

CATTAATCAA ACATCGAACC CAACGATGGA ACCAAATACT ATTACGCAAG TAACATCACA 21240

AGTGACAACT AAAGAACAGG CATTAAATGG TGCGCGAAAC TTAGCTCAAG CTAAGACAAC 21300

TGCGAAAAAC AACTTGAATA ACTTAACATC AATTAACAAT GCACAAAAAG ATGCGTTAAC 21360

GCGTAgcATT GATGGTGCAA CAACAGTAGC TGGTGTAAAT CAAGAAACTG CAAAAGCAAC 21420

AGAATTAAAT AACGCAATGC ATAGTTTACA AAATGGTATC AATGATGAGA CACAAACAAA 21480

ACAAACTCAG AAATACCTAG ATGCAGAGCC AAGTAAGAAA TCAGCTTATG ATCAAGCAGT 21540

AAATGCAGCG AAAGCAATTT TAACAAAAGC TAGTGGTCAA AATGTAGACA AAGCAGCAGT 21600

TGAACAAGCA TTGCAAAATG TGAACAGTAC GAAGACGGCG TTGAACGGTG ATGCGAAATT 21660

AAATGAAGCT AAAGCAGCTG CGAAACAAAC GTTAGGTACA TTAACACACA TTAATAATGC 21720

ACAACGTACA GCGTTAGACA ATGAAATTAC ACAAGCAACA AATGTTGAAG GTGTTAATAC 21780

AGrTAAAGCC AAAGCGCAAC AATTAGATGG TGCTATGGGT CAATTAGAAA CATCAATTCG 21840

TGATAAAGAC ACGACGTTAC AAAGTCAAAA TTATCAAGAT GCTGATGATG CTAAACGAAC 21900

TGCTTATTCT CAAGCAGTAA ATGCAGCAGC AACTATTTTA AATAAAACAg CTGGCGGTAA 21960

TACACCTAAA GCAGATGTTG AAAGAGCAAT GCAAGCTGTT ACACAAGCAA ATACTGcATT 22020

AAACGGTATT CAmAACTTAG ATCGTGCGAA ACArGCTGCT AACACAGCGA TTACAAATGC 22080

TTCGGACTTA AATACAAAAC mAAAAGAAGC ATTAAAAgCA CAAGTAACAA GTGCAGGACG 22140

TGTATCTGCA GCAAATGGTG TTGAACATAC TGCGACTGAA TTAAATACTG CGATGACAGC 22200

TTTAAAGCGT GCCATTGCTG ATAAAGCTGA GACAAAAGCT AGTGGTAACT ATGTCAATGC 22260

TGATCJCGAAT AAACGTCAAG CATATGATGA AAAAGTTACA GCTGCCGAAA ATATCGTTAG 22320

TGGTACACCA ACACCAACGT TAACACCAGC AGATGTTACA AATGCAGCAA CGCAAGTAAC 22380

GAATGCTAAG ACGCAGTTAA ACGGTAATCA TAATTTAGAA GTAGCGAAAC AAAATGCTAA 22440

CACTGCAATT GATGGTTTAA CTTCTTTAAA TGGTCCGCAA AAAGCAAAAC TTAAAGAACA 22500

AGTGGGTCAA GCGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAACATT 22560

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TACAACAGTG CAGTCACTGC 22680

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800
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AGATGCAGTG AAACGTCAAA TCGAAGGTGC AACGCATGTT AATGAAGTAA CACAAGCACA 22920

AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980

GAATACGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACAACGT GCTAAAAACG AATTGAACGG 23160

TAATCAAAAT GTTGCGAACG CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220

AATTAATAAT GCACAAAAAG AAGCATTGAA ATCACAAATT GAAGGTGCGA CAACAGTTGC 23280

AGGTGTAAAT CAAGTGTCTA CAACGGCATC TGAATTAAAT ACAGCAATGA GCAACTTACA 23340

AAATGGTATT AATGATGAAG CAGCTACAAA AGCAGCGCTT AATGGTACTC AAAACCTTGA 23400

AAAAGCTAAA CAACACGCAA ATACAGCAAT TGACGGTTTA AGCCATTTAA CAAATGCACA 23460

AAAAGAGGCA TTAAAACAAT TGGTACAACA ATCGACTACT GTTGCAGAAG CACAAGGTAA 23520

TGAGCAAAAA GCAAACAATG TTGATGCAGC AATGGACAAA TTACGTCAAA GTATTGCAGA 23580

TAATGCGACA ACAAAACAAA ACCAAAATTA TACTGATGCA AGTCAGAATA AAAAGGATGC 23640

GTACAATAAT GCTGTCACAA CTGCACAAGG TATTATTGAT CAAACTACAA GTCCAACTTT 23700

AGATCCGACT GTTATCAATC AAGCTGCTGG ACAAGTAAGC ACAACTAAAA ATGCATTAAA 23760

TGGTAATGAA AACCTAGAGG CAGCGAAACA ACAAGCGTCA CAATCATTAG GTTCATTAGA 23820

TAACTTAAAT AATGCGCAAA AACAAACAGT TACTGATCAA ATTAATGGCG CGCATACTGT 23880

TGATGAAGCA AATCAAATTA AGCAAAATGC GCAAAACTTA AATACAGCGA TGGGTAACTT 23940

GAAACAAGCG ATAGcTGACA AAGATGCTAC GAAAGCGACA GTTAACTTCA CTGATGCAGA 24000

TCAAGCAAAA CAACAAGCAT ATAACaCTGC TGTTACAAAT GCTGAAAATA TCATTTCAAA 24060

AGCTAATGGC GGCAATGCAA CACAAGCTGA AGTTGAACAA GCAATCAAAC AAGTTAATGC 24120

TGCAAAACAA GCATTAAATG GTAATGCCAA CGTTCAACAT GCAAAAGACG AAGCAACAGC 24180

ATTAATTAAT AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA AACAACAAGT 24240

TCAAAATGCA ACTACTGTAG CTGGTGTAAA CAATGTTAAA CAAACAGCAC AAGAGTTAAA 24300

CAATGCTATG ACACAATTAA AACAAGGCAT TGCAGATAAA GAACAAACAA AAGCTGATGG 24360

TAACTTTGTC AATGCAGATC CTGATAAGCA AAATGCATAT AATCAAGCAG TAGCGAAAGC 24420

TGAAGCATTA ATTAGTGctA CGCCTGATGT TGTCGTTACA CCTAGCGAAA TTACTGCAGC 24480

GTTAAATAAA GTTACGCAAG CTAAAAATGA TTTAAATGGT AATACAAACT TAGCAACGGC 24540

GAAACAAAAT GTTCAACATG CTATTGATCA ATTGCCAAAC TTAAACCAAG CGCAACGTGA 24600
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AGCGGCGACA ACGCTTAATG ACGCGATGAC ACAATTGAAA CAAGGTATTG CGAATAAAGC 24720

ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATGA 24780

TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 24840

AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 24900

TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 24960

AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC 25020

AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGACTA ACTTAAACAA 25080

TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAC TTTACTGATG CAGATCAAGC 25140

TAAGAAAGAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 25200

TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 25260

ACAAGCATTG AATGGTAATG ACAATGTACA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 25320

TACAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 25380

AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAT ACTGATAAGC AAAATGCTTA 25440

CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA 25500

TCCACAACAA GTGGCTCAAG CGTTACAACA AGTGAATCaA GCTAAGGGTG ATTTAAACGG 25560

TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC Aol IAV.LAAA 25620

CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCGCATGCAG AACTTGTTAC 25680

AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGCAATGG GTACATTGAA 25740

ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 25800

AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG CAAATGGCAT 25860

ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 25920

GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 25980

TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 26040

TCAAGCACAA GCGTTAGCTA CAGTTGAACA AACTAAACAA AATGCACAAA ATGTGAATAC 26100

aGCaATGAGT AACTTGAAAC aAGGTATTGC aAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160

CTATCATGAT GCTGATGCCG ATAAGCAAAC AGCATATACA AATGCAGTGT CTCAAGCGGA 26220

AGGTATTATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 26280

AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA 26340

GCAAAATGCT AAAGATGCCG TAAGTGGGAT GACGCATTTA AACGATGCTC AAAAACAAGC 26400
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AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520 

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580 

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640 

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700 

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760 

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820 

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880 

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940 

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000 

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27060 

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180 

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240 

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300 

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360 

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420 

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480 

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540 

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600 

TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660 

AGCflAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720 

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780 

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900 

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960 

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 28020 

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080 

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 2814 0 

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200

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TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 28800

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 28980

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 29040

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220

AGCAGCAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 29280

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460

GGTTGCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000

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TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240

TCGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300 

CACGAACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360 

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420 

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480 

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540 

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600 

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660 

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720 

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840 

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960 

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020 

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080 

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60 

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180 

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240 

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300






EP0 786 519 A2 






TAATGAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540 

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600 

GTACCAGGTC CTGTTGACTT TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660 

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720 

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780 

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840 

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260 

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320 

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380 

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440 

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500 

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560 

CATG^ATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620 

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680 

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740 

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100 
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220 
ATAAAATATT TGAATTAGAA GGC 2243

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

TTGGnATCAT tyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60 

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120 

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180 

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600 

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660 

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720 

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780 

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900 

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960 

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020 

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080 

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260

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GTTATATAAC AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 1380

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 1620

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 1740

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 1800

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTtACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA GAATTATGGT CAAAATTAGG 3060

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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180 

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3240 

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300 

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360 

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420 

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480 

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600 

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660 

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720 

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780 

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840 

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4020

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTAT TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200 

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4260 

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320 

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380 

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTACCAA CTAAAACAAA 4440 

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500 

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4740

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860

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TCGTATACAT GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4980

AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAATGACTA CTAAAAACGG 5040

TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100 

GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160 

CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 5220 

AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 5280 

TGTTTTTGTT GAAAGACGAA CAATGGATCC AACAGCAACA AATAATAAAG AGTTTACTGT 5340 

AACAACATCA TTAAAGAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 5400 

ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 5460 

TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 5520 

TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 5580 

GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640 

AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700 

AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760

AGATGCATTA CAAGCCGAAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCGTC 5820 

ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5880 

CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GATTCATTAA CTAATCAAAT 5940

GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 6000 

AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6060 

ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120 

TGATGGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 6180 

ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGCAACGA AACAAAGGGA 6240 

AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 6300 

AGCTACGGAT GAAACAGATG CTATTGATAA TGTTACGAAT GCTACTACAA ATGCTGACGT 6360 

TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 6420 

TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AACAAATAAA 6480 

TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 6540

AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600

CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660 
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6780 

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840 

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6900

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7380

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATmATATAA ATCAAGCTGA 7500

TACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7680

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7920

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10953 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60

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AGATGAATGC TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 180

TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240

AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300

AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAACAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540

1 XVjXAX XAAX TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600

P A T A A TPTY2 T L-ft 1 AA X V. Xvj A TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660

AG 1 vaA 1 A\j\j X GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720

PTAPAP.T-TPP TPIV J\ T"TV' ATCCTTTATG ACGTTCGCGT TGTACAATGG CATTTCTTTC 780

Afi/IPATPPTT X XAGXAC X X A AAAATGAAGA CATATTTTtC GGACCTAACC AACCAGGATC 840

AfSPATP A a AP T*f"*a TV™"! 1 7k T"P*P 1 \—n L\s X A X X X CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900

A A P & 71 TTT a A G XTjLAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960

ATP£"JfiTPTTA TPATATAPTA PTTV2 & TP A R r» TTCTTTCTCG AAGATATGAT TTACATTCTG 1020

TTTGTCTTTT AAAATTVTPAP. fZPATA A AP/T1P AAXAX X X\»XA U GCCTATTG CAATAACGAC 1080

GCAATCTGTA GTGATAATTT GTCCATCTTC X AAL. X 1 GA X A TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320

ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560

TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860

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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 1980 

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040 

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100

ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2280

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660 
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ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGCT TATATCTGCT 3780 

AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 3840

GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 3900

ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3960

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4020

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4080

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 4140

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4500

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4620

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4800

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920

TTTTACGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460

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ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5580

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 5820

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 7260

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CCAATTTAAG GTTTTCATTA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 7380

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 7440

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7500

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 7560

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 7620

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7680

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 7740

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAAGTGAATC TTCATAGCGC AATACATCAA 7800

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGCGTTT TTCAAAATGG CTTAATGCTT 7860

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7920

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 7980

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 8040

TTAAAAGTAT GTTAATATAT ATGTATCATA ACATGAATGG AGAATATAAA TGGCTAACTA 8100

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 8160

AGGTGACATG ACATTCAAAT TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 8220

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 8280

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 8340

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATGGC 8400

TAACTCAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC 8460

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA 8520

TGGCQAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 8580

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAGATAAACC 8640

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8700

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8760

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8820

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 8880

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTGATTTAG 8940

ATAGGAATTT GTTGCTAATT TTATTTGTAA ATCCAAGTTT GTAGAATTCT TATTCATTTA 9000

TAAAATAATA TTCGTATGAT TTGATTTTTT AATTAGTCCA CCATTTCGAT TTGTGCTATG 9060

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T\ 7A /"'TV T TV I'/"' M » GGTGCGTGTA CTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9180

Tr'A T A rnv tv n 1 VJA 1 Aw 1 \jAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240

TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTPT TAGACTATAA 9540

aagagattag TATCTATTAA ATTTTATTAG t\± nv> iwii i, ACGATAACGT 9600

AATATGaTTG ATTCTATTTA CACGTACAAA 7W1TTT A A OTl 1 w\n\_n 1 /\ 1 w w ATTATCTTTG 9660

TTAGATAGAA TCGTTGATTT GCAATATTGT WXrWlX.nX XWX *41 V3 1 VJV3/11 1 X Oil lllllln TTTATTTTAG 9720

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAP TATATAAAPA GATAATTGGA 9780

GAATGAAAAA ATTACATGTT ATAGTCAACT P A 21 T*A ATTTT 1 V— rVrl X rvrV 1 111 a Ann Arm a at TAAGTAATGA 9840

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTAPP AAATfSflAflTA /W* 1 WnvJin GAGTTGAGAA 9900

ATCGATTTGT GTTAGCCCCT TTAACACATA TTT PTTP A A A x x x w x x w^inn 1 VJrt 1 U/\l VJV5 1 ACTATTTCAG 9960

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320

GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620

TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860
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w 






AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10953 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 60 

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 120 

GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180 

AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240

TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 300 

CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 360 

CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 420

CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 480 

AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 540

CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 600 

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660 

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 720 

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780 

TACCAAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 840 

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900 


TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 960 

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 1020 

CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080

CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 1140 

GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 1200 

ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320 
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CTGGTGATGG TTTATTAACT 


GGTATTCAAT TAGCTTCTGT 


AATAAAAATG 


ACTGGTAAAT 


1440 




CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 1500

TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 1560

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 1620

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 1740

TGCGTATGcA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800




ITU 21 7H**B. A IV ZL HP&R. SB. » S 


GTAATCAATA TGTAATATAA 


AATACACTGG 


TACTCAATAT 


1860 




AAiAjHlliA i/VAAAl IAA1 


TTTAATTAGA TAGAGTTGCT 


TTGTGTTTTT 


AACGCAGATG 


1920 




CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 2280

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340


TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520

AAATGCTAGC AATACTGGTG AAGGCAGTAT TAATACGACA TTAACATTTG ATGATCCTGC 2580

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 2640

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 2760




TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2820

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3000

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120
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TAATACTGAA 


ATCGGTAACA 


ATGGTAATTT 


TGGTGCTTCA 


TTAAAAGCAG 


ATCAATTTAA 


3240 




ATATGAAGTA ACATTACCAC AAGGTGTAAC TTACGTTAAT AATTCATTAA CTACAACATT 3300

CCCTAATGGT AATGAAGACA GTACAGTATT GAAAAATATG ACTGTTAATT ATGATCAAAA 3360

TGCAAATAAA GTTACATTTA CAAGCCAAGG TGTGACAACG GCACGTGGTA CACACACTAA 3420

AGAAGTTTTA TTCCCAGATA AATCTTTAAA ATTATCATAT AAAGTTAATG TTGCGAATAT 3480

CGATACACCT AAAAATATTG ATTTTAATGA AAAATTAACA TATCGTACTG CTTCAGATGT 3540




Tf*TA ATT A JIT* 


AATGCGCAAC 


CAGAAGTaCA 


CTAACTGCAG 


ATCCATTTTC 


AGTAGCGGTT 


3600 




f3AAATf:AAPA 


AAGATGCGTT 


GCAACAACAA 


GTAAACTCAC 


AAGTTGATAA 


TAGTCATTAC 


3660 


APLAPAC2PAT 
AlwA/\V_AG Ui 1 


CAATTGCAGA ATACAATAAA 


CTTAAACAAC 


AAGCAGATAC 


TATTTTAAAT 


3720 






A I UATGTTAA 


T\ TV / fill/"* ^ ll » » rri 

AACTGCAAAT 


CGTGCATCTC 


AAGCGGATAT 


TGATGGTTTA 


3780 




fVPAAPTA a AT 


1 At-AAGCTGC 


ATTAATTGAT 


AATCAAGCAG 


CAATTGCTGA 


ATTAGATACT 


3840 






a a & a /VTT ft /"■ 


AGGAG CACAA 


CAAAGTAAAA 


AAGTTACGCA 


AGATGAAGTT 


3900 




rcp anr apt*ty2 


innnnl 


I AAC AATGAT 


AAAAATAATG 


CAATCGCAGA 


AATTAATAAA 


3960 




PAAAPTAPAfS 


PAPA fiil/lTPT 


pr pa arTP* r 
LALAAL 1 GAA 


AAAGATAATG 


GTAT CGCAGT 


GTTAGAACAA 


4020 




RATfSTCSATTA 


PAPPAAPAPT 


*ra i appTpa a 


G AAAGAAG 


TV rp *v mm « rpf*i /■» * 

AT ATT ATC CA 


AGCAGTTACA 


4080 




ACTCGT AAA p 


AACAAATTAA 


AAAGTCAAAT 


p p a tp a TT a c* 
GL-Axv_Ax xAl. 


AAGATGAAAA 


AGATGTAGCA 


4140 




AATGATAAAA 


TTGGTAAAAT 


TGAAACAAAG 


pp& tTTa a ap 


AxAi I/GAxGG 


AGCAACAACA 


4200 




AATGCACAAG 


TAGAAGCCAT 


TAAAACAAAA 


a a TPA & TP 
(JLAnl LAA1 


at att a rtv^r 
AlAi lAAluA 


AACTACACCT 


4260 




GCTACAACAG 


CTAAAGCAGC 


AGCTCTTGAA 


GAATTTGACG 




AG uAuAAA 1 1 


4320 


GATCAAGCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 4440

TTAGAAAGAG TTAAAAACGA AGAAATCTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4500

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620


GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740

GCAGCTGATA CGGAAGTAGA AAACGCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800

AATGCTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040 

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100 

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 5220 

GCAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280 


GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340 

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400 

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460 


GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520 

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 5640 

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700 

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760 

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060 

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120 


AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180 

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300 


AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360 

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420 

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 6480

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540 

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600 

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720
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A A AfVTTP A A P 


AACTTCATGC 


AAATCCTGTT 


AAGAAACCAG 


CAGGTAAAAA 


AGAATTAGAT 


6840 




CTGATAAGAA 


AACACAAATA 


GAACAAACAC 


CAAATG CAT C 


ACAACAAGAA 


6900


ATTAATftATn 


CAAAACAAGA AGTTGATACT 


GAATTAAATC 


AAGCGAAAAC 


AAATGTCGAT 


6960 


PAATPATPAA 


CAAATGAATA 


TGTTGATAAT 


GCAGTTAAAG 


AAGGAAAAGC 


TAAAATTAAT 


7020 




CATTTAGTGA 


GTACAAAAAA 


GATGCTTTAG 


CTAAAATTGA 


AGATGCATAT 


7080 


AATflPTAfc Afl 


TAAACGAAGC 


GGATAACTCT 


AACGCATCGA 


CTTCAAGTGA 


AATTGCTGAA 


7140 


ppfiAA ap&aja 


AACTTGCTGA 


ATTAAAACAA 


ACTGCGGATC 


AAAATGTTAA 


TCAAGCTACT 


7200 




ACATTGAAGT 


TCAAATTCAT 


AATGACTTAG 


ATAATATTAA 


CGATTACACA 


7260 


AHl CAAGAG 


GTAAAAAAGA 


ATCAGCTACA 


ACAGATTTAT 


ATGCTTATGC 


AGATCAGAAG 


7320 


AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680




cTCaAGGACT 


TGAAGCATTT 


GATAACATTC 


AAATCGACTC 


AACAGAAAAA 


7740


CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGAAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180 

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240 

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300 

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360 

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420 

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480 

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540 

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660 

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840 

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020 

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140 

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200 

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260 

TCAT6AATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320 

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 1500 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620 

CAAAGAAACG 1630 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:



CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 240


CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 360

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 480


CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 540

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 660

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 720

TACCACAACC CG 732



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5838 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 






AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 60

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 240


AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 300

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 420
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CAACTTTATA 


CATTAAAATA 


ATATCATAAT 


AAGGATAAAA 


AATAATAGAT 


ATTGATTTTA 


540 


GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840

ATAACGATAT 7GAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900


TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 1020

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260


CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680


AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920

TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980


AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040

TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100

GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160

TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2220
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AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 2340 

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 2400 

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460 

TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2520 

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 2580 

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 2640 

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 2700 

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2760 

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 2820 

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 2880 

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 2940

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 3000 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060 

TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120 

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180 

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3240 

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 3300 

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 3360 

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420 

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 3480

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3540

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 3600

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3660 

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720 

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3840

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3900

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3960 

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4020 
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GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGGcACA 4140 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 4200 

GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4260 

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4320 

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 4380 

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440 

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500 

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560 

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4740

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4800

TGGTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040 

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100 

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160 

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220 

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 5280 

TATCFTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700 

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5760 

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820 
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(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18355 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:








ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240




TTAACAAAAG 


urW.O 1 1 1 V< 




TTATTCTCAC 


TAAAGCAATA 


CGCAGAGACA 


300 




ACTTACGTAA 


AATTTTfi A A r* 




GGAACTTCTA 


CTCAATTATT 


GATAAAAATT 


360 




TTCAAAAAGA 


CTTGAATGTG 


C*TCZ HCZ A AT a C 


GAAGTTTATG 


GAAGGATTAT 


CAAAATATAA 


420 




ATGTGCATTC 


ATTTACAAP C 


T 1 *TT > A T*TT2 a P a 


ATGATTCTCA 


ACTAATATAG 


TATATAATCA 


480 




AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720


TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900


AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080


CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260


GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380
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TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG 1500 

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 1560 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT 1620 

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 1680 

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 1740 

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 1800 

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 1860 

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920 

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA 1980 

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 2040 

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100 

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA 2160 

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2220 

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC 2280

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT 2340 

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 2400

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 2460

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2520 

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2580 

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 2640 

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG 2700 

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 2760 

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2820 

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2880 

AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2940

ACATGTCATT TTAATATCAC CGACATTTGG TTCGGAAATG ATTGTCGAAC AATTTATGTC 3000 

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT 3060 

TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 3120

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT 3180 
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10 






TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300 

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360 

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420 

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540 

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600 

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660 

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720 

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780 

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840 

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140 

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200 

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260 

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320 

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440 

TCGACATTAA TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500 

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620 

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680 

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920 

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980 
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ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100 

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160 

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220 

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280 

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340 

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460 

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520 

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580 

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700 

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760 

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820 

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 5940

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000 

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120 

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180 

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240 

AATGTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300 

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360 

GTATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780 
TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 7740

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 7920

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

TATX?^AAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580
TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATGTGGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGGCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380
TTACATGAAA ATATGCAAAA CGAGTATAAC TGCTAATTGA TAGAAATAGC TCACCATAAA 10500

ATTACGGTAT GATTTTAAAT ATAAGTAAGT CGCACTACCT GCTAGTATCA ATGCTGGAAT 10560

GAATTCCCAC CATGTATTAA TGTATGGATA GTAGAACAGA GTTTCAAGGA TAATGGACAA 10620

TACTATTGTA ATCTTTAAAG GTATTAATCT GCTTAATTCT TGAATTAAAA TATGACGGAA 10680

AATAAGTTGA CAAATCAAAG TATTTAATAT AATGGTTAAC GAAAATATAG CTATTAAACT 10740

GATGGAaCCA TACCCTTTAA TGAGCGGGTA AATGTCAAAG ACAGTAAAGG AATCTACATT 10800

TAGTGCGAAA ATATTGAAAT GATTTAAAAG TAAAAAGAGT ACGACACTTA GTGTAAATGA 10860

TATAAGAATA TGCCATTTAT ATTTAGCACT AGCAACGATT TGCGAACGTA TCATTGGAAT 10920

AAACGCATCT TCATGCATCA GACGAAAAAT AGCTAGTGAA ATAATAACTG CGAGTAAATA 10980

GCTAATGTTC ATTGAAATAG GAAAAGAGAA ACCCCACGGA GCTTGTTGAG TGAATACAGC 11040

TACTAACCCA AAAGTTAAAA AGACGATAAT GATCGGCAAG ATGTTAACCA AAAATATGTA 11100

AAGGAAAATA AATCCAATAT CACGTTTGAA AAAACGCGAT TGTTCGGTAG CGTATTCTTC 11160

TTCTATGTAA TGTTTATTTG TATTTGACAT AGTATACCTC TTAAATAGTT GTATTATATA 11220

GATACTTTAG CACATATTAC TTTGTATTGT ATGTTTTATA CATTAAAATT TAAAATGAAA 11280

AACATATCAT AAAATTGTTT TATAAAATGA AGCGCTTCCA TTGTGTTTTG TTTTGTAAGG 11340

TGTATCATAA ATATTGAATT GAAATTTTGG GGGGAGGTAT TGTAATGACG TTTCTTACAG 11400

TCATGCAATT TATAGTTAAC ATTATCGTTG TAGGATTCAT GCTTACGGTT ATTGTTATCG 11460

GGCTTATTTG GTTAATTAAA GATAAAAGAC AATCACAACA TAGTGTATTA AGGAATTATC 11520

CTTTACTAGC ACGTATTAGA TATATTTCAG AAAAAATGGG ACCGGAATTA CGTCAGTATT 11580

TATTTTCTGG GGATAATGAA GGGAAACCTT TTTCACGTAA TGATTATAAA AATATCGTTT 11640

TGGC&GAAA ATATAACTCT CGTATGACCA GCTTCGGTAC TACTAAAGAT TATCAAGACG 11700

GCTTTTACAT ACAGAACACA ATGTTTCCGA TGCAACGTAA TGAGATTTCA GTAGATAATA 11760

CAACATTGTT ATCAACATTC ATTTATAAAA TCGCGAATGA GCGTTTATTT AGTCGTGAAG 11820

AATATCGTGT GCCGACAAAG ATTGATCCGT ATTACTTAAG TGATGACCAT GCAATAAAAT 11880

TAGGTGAACA TTTAAAACAT CCATTTATTT TAAAACGTAT CGTAGGACAA TCTGGTATGA 11940

GTTATGGCGC TTTAGGAAAA AATGCCATTA CAGCTTTATC TAAAGGTCTA GCTAAAGCGG 12000

GCACTTGGAT GAATACAGGT GAAGGTGGCT TATCAGAATA TCATTTAAAA GGTAATGGGG 12060

ATATCATTTT CCAAATTGGT CCCGGTTTAT TTGGTGTTCG TGATAAAGAA GGTAATTTTA 12120

GTGAAGGTTT ATTTAAAGAG GTTGCACAGT TATCTAACGT ACGCGCATTT GAGCTGAAGT 12180
TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300

TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480

CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGTGTC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720

TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440

CTTfATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCwTCAT 13500

ATGTTTTGGT TTGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT GAGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTk 13740

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980
AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400

GTTTAAAAGA TGAAAATATT GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520

TcGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCGA AGTACAATGG GATACGACAC 14700

CAGGTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000

TTTTGTTTTT AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060

ATGATTTTTT CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240

ATTATAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360

GAATTGCGTC AACGAATCAG TTATTTGATG CAGCAAAGTG ACTTGTTTGG TGAAACGATT 15420

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600

ATTCTTTTAT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780
CATTCCGATT ATCATTTCAT ATAAAGAAGG TTTACATATT ATTAAAGATT TAATTGTTGC 15900

GACATTACGA GCAGTTGTGC AATTAATCAT TTTGGGATTT TTGCTGCATT ATATTTTTAA 15960

AATAAACGAT AAATGGCTGC TTATTTTATG TGTATTGGTC ATTATTATTA ATGCATCATG 16020

GAATACAATT AGTCGAGCAT CACCAGTGAT GCATCATGTG TTTTGGATAT CATTTCTAGC 16080

TATCTTCATT GGAACGGCAT TACCGCTTGC AGGTACTATT GCGACAGGGG CCATTCAATT 16140

TACCGCAAAT GAAGTTATAC CTATCGGCGG CATGCTTGCA AATAATGGCT TGATTGCAAT 16200

TAATTTAGCT TACCAGAATT TAGATCGTGC ATTCGTACAA GATGGTACTA ATATTGAATC 16260

TAAATTATCA CTTGCAGCTA CACCTAAATT GGCTTCTAAA GGTGCAATAC GTGAAAGTAT 16320

TCGTTTAGCT ATAGTGCCAA CTATTGATTC GGTTAAAACA TATGGGCTTG TGTCGATTCC 16380

TGGTATGATG ACAGGCTTAA TTATTGGTGG CGTACCACCT TTACAAGCGA TTAAATTTCA 16440

ATTGTTAGTC GTGTTTATTC ATACAACTGC GACCATTATG TCTGCTTTGA TTGCGACATA 16500

TTTAAGCTAT GGTCAATTTT TCAATGCAAG ACATCAATTA GTAGCACGAA ATACTGATGT 16560

TAAGAGTGAA TCATGATAGA TTTTACTGCA TCAGATTTAG GCATTAGTTT TAATTGGAAA 16620

TGAAGTGACG CGCACATATA GTATCGCTAT TCATTAGCGC AGCGAAAATA TTCATAAAGG 16680

CACGCATACT TTGTAGTCAG TTATCTGTTC TGACATATAA AGCGTGCGTG CTTTTTTGGA 16740

GTTATTGTTG AAACTGAAGT AATTATACAT AATTATTAAA TGACATACTT GTGTTAATTT 16800

TTCAAATACT GAAAAACAAT TTCaATAATT TTCCaATTAA GCACAGAAAA TTAAAGCAAA 16860

ATATTATATA ATAGAACGGT TATATATaAA nATTngTgCA CACATTTTTT AATAAATCGT 16920

TATTCTAAGG GAAATGAATA TCGGAAATTT TGTTTGAAAG GAGTTTTAAA TTGTCAATCA 16980

TGCGACTATT TACATTCATT TTAAGTATTT TTATCGTAGG AATGGTTGAA ATGATGGTTG 17040

CAGGAATTAT GAACTTGATG AGTCAGGACT TACATGTATC AGAAGCTGTC GTTGGTCAAT 17100

TAGTGACAAT GTACGCTTTA ACATTTGCGA TATGTGGACC TATTCTGGTT AAATTAACGA 17160

ACCGTTTTTC ATCAAGGCCT GTATTATTAT GGACATTACT TATATTTATC ATTGGTAATG 17220

GCATTATTGC TGTAGCGCCA AATTTTTCaA TATTAGTAGT TGGTAGAATT ATCTCATCTG 17280

CAGCAGCAGC ACTAATTATC GTAAAAGTAT TAGCTATTAC AGCGATGTTA TCAGCACCTA 17340

AAAATCGTGG TAAAATGATT GGACTTGTCT ATACAGGGTT TAGTGGTGCT AATGTTTTTG 17400

GTGTACCAAT TGGAACGGTT ATCGGCGATT TAGTAGGTTG GCGCTATACA TTTCTATTCT 17460

TAATTATTGT GAGTATTATT GTTGGCTTCT TGATGATGAT CTATTTACCG AAGGATCAGG 17520

AAATACAACG AGGCCCTGTG AATCATGAGA CACCATCTCA TGAAAATCAT GTTACTTCGA 17580
CAAACTCAGT GACATTCGTC TTTATAAATC CACTTATTTT ATCTAATGGT CATGATATGT 17700

CATTCGTTTC ATTAGCACTT CTAGTAAATG GAATCGCTGG CGTTATTGGA ACATCATTAG 17760 

GTGGTATATT CTCCGATAAA ATTACAAGTA AGCGTTGGTT AATGATTTCT GTTTCTATTT 17820 

TTATCGTCAT GATGTTACTT ATGAATTTAA TCTTACCTGG TTCAGGTCTA TTGTTAGCAG 17880 

GACTATTTAT TTGGAATATC ATGCAATGGA GTACTAATCC AGCAGTGCAA AGCGGTGTGA 17940 

TTCAACATGT TGAAGGCGAC ACAAGCCAAG TAATGAGTTG GAACATGTCT AGTTTAAACG 18000 

CTGGTATTGG TGTTGGAGGC ATTATTGGAG GCTTGGTCAT GACACATGTT TCTGTTCAAG 18060 

CTATCACATA TACGAGTGCC ATCATTGGCG CATTAGGATT AATCGTTGTT TTCACATTGA 18120

AAAATAATCA TTATGCTAAA ACATTTAAAT CATCATAATT CTCATATGAm AAGCACGCCT 18180 

GCTATCAAAT TCAGGTGTGC TTTTTTAGAT GCGATAACGT TATTGATATG TGCGATAATA 18240 

GCGACGTTCA TTATGATACA TCGGCCAAGG CATTTTACCG CTTTTAGCAA AATTAGCTAA 18300

ATCATTTTGC ATTTGTCGAC TTAAAAATTT AAGGTGaGCA GTTGTTGGaT ATgAT 18355 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1192 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

CGCAAAGAAG TACAAAAAAT GTTTTTACAA GAAGGTATTA AAACACCTCA ACCAATTATG 60 

ACTGCTTATA ATCATAGTGA AAACGgTGTT TAGTAGTTTA TAATACATGG AGGTCATATT 120 

TAATGGCGTC AAAATATGGA ATAAATGATA TAGTAGAAAT GAAAAAACAA CATGCGTGTG 180 

GAACAAACCG TTTTAAGATT ATTAGAATGG GTGCAGACAT AAGAATTAAA TGTGAAAATT 240

GTCAAAGAAG TATTATGATT CCACGTCAAA CGTTTGATAA AAAACTTAAA AAAATCATCG 300 

AATCTCATGA TGATACACAA AGATAGGAGA ATGATTAATG GCTTTAACAG CAGGTATCGT 360 

TGGATTGCCA AACGTTGGTA AATCAACATT ATTTAATGCA ATAACAAAAG CAGGTGCTTT 420

AGCAGCGAAC TATCCATTCG CTACGATTGA TCCTAATGTA GGGATAGTAG AAGTGCCAGA 480

TGCTAGATTA CTTAAATTAG AAGAAATGGT TCAACCTAAA AAGACATTGC CGACTACATT 540 

TGAATTTACA GATATCGCTG GTATTGTGAA AGGTGCTTCA AAGGGAGAAG GGTTAGGTAA 600

TAAATTCTTA TCACATATTA GAGAAGTAGA TGCGATTTGT CAGGTCGTTC GTGCATTTGA 660

TAATATGGAA TTAGTACTAG CGGACTTAGA ATCTGTTGAG AAACGTTTGC CTAGAATTGA 780

AAAATTAGCA CGTCAAAAAG ATAAGACTGC TGAAATGGAA GTACGTATTT TAACAACTAT 840 

TAAAGAAGCT TTAGAAAATG GTAAACCCGC TCGTAGTATT GACTTTAATG AAGAAGATCA 900

AAAATGGGTG AATCAAGCGC AATTACTGAC TTCTAAAAAA ATGCTTTATA TCGCTAATGT 960 

TGGTGAAGAT GAAATTGGTG ATGATGATAA TGATAAAGTA AAAGCGATTC GTGAATATGC 1020 

AGCGCAAGAA GACTCTGAAG TGATTGTTAT TAGTGCAAAA ATTGAAGAAG AAATTGCTAC 1080 

ATTAGATGAT GAAGATAAAG AAATGTTCTT AGAAGaTTTA GGTATCGaAG AACCAGGATT 1140 

AGATCgrTTA ATTAGGAmCA CtTATGAATT ATTAGGnTTA TCCACCATAA TT 1192 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7494 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



AATATAGCTG CAATAGCATC TCGTTTCATT TGTATAATCA ATTCCGGTTT AAATATCAGT 60

GTGAACGTAA GCACGACACA GATTAAAAAT AACACTGCCG GAATGAGTCG TTTCAATCGT 120

CGCTtCCAAA ACTCTAGCAA ATCGATTTTT TGCGTCCGAT AATACTCACT TATCAACAAA 180

CTTGTTATTA AATAACCTGA AATAACGAAG AATGTATCTA CTCCTAAAAA GCCCCCACTT 240

AACCATTGTG CATTCAAGTG ATAAATAATG ATTCCTATAA CTGCGAATGC CCTCAATCCA 300

TCTAATCCAG GTAAGTATCG CGGGGAATAC ATTTTTTCTA AACGTTTAAA GTCTTTTGTA 360

TCCA'fGTTAA TAAACGCCCC ATTTATTTTT CTCTATTTTG TAGTATATCA CAATATTTTT 420

GAAAATAAAA TATTGCACTG aTTTTCATTA ATTGATTTAA CCCTTAATTA AGATAGTTTT 480

AAATTTTTTA TTAAGTAGAA AACAATTATT ACAGTTGATT TCATTACTGC AAACCACATA 540

TAAATTTGTC GATTTTACTA CATAACATAG ATTATCATAG ATTCTTGAAT TTTTAGCAAA 600

ATAACTGTTA TTTTCATTAT ATTTTTACAA AAAAAGGTTC GTTTTATATT TTATGCATCT 660

TACTGTAACA GAATCATTAA GATATGCTAT TCGAATATAC TTTTTCAAAA TTTATATAAT 720

GAATAAATTA ACATGTATTG AAAAAAAAGC GAAATGCAGC CTATCCTCTA ATGTAAACCA 780

AACGATATAT CTCGTCAGAC TTTATATTTA AACGCTATGT GTCACTTTTA AAATGAATAT 840

TACTAAGATT GTCATATCAA TTATTATTGC ATCGAATTAA TCTTTTAAAT TTCTGTAATA 900
ACGGAAGTCA TTATTAGAAT AAAAATACTG TGCACTAATA AATTTATCAA TTGTTCCTAA 1020

ATAAATACCA TCGATATTTT GTTCTTTACA TGTCATTATA ACTTTATCTA AAAGTTTTTT 1080

ACCTATTTTT AAATTCCTAT AACCTTTATC AACAAACATT TTTTTAAGTG CAGACATATT 1140

ATTATCTAGT CTAATCAAAC CTATAGTACC AACAATATTT TGaTGATTGT TTATTGCAAG 1200

CCAAAATgCC CTCCATTATT CAAATAGTTA TGTTCGATGT TCTCCAAATC AGGTTGATCA 1260

TCTCTATCAA TTTTTATATa AATTCATTTT TTTGAATCGA TAAAATAAAC TCGATTAGCT 1320

CTTCCTTATA AGACCTATTA TATTCAATTA TGTTTATAGC CATTTTTATC TCCTTTTTCA 1380

TTTAATTTAA TTATAAAATG TGCGTTTAGT TTGTATCTAG TGTACTCAGT ACAGCCTCAA 1440

ATGAAGTTTC ATTCCACTTG GCACTTAATA AAGACAAGTA TTTTAGCAGT AATACAATAA 1500

AGTCCAATAA ATTTCCCTAA CTTCAATATC CACTTTTTAA AAAATGTATT TTTAATTAAT 1560

AAAAAAACTC TCCCCAATTT CTATGGGAAG AGCTATATAT TTAATGTCTA AACATTACTT 1620

TTATTTATTA TGAAGGAATT AGAATCCCCA AGCACCTAAA CCTTGTGCTT TGTATGCTTT 1680

AACAGCTGCG TTGATTTGTT GGTCAACAGT GTTTGTTGGA CCCCAACCTG GCATAGTTTG 1740

GAATAAACCT GAAGCACCTG ATGGGTTGTA AGCATTTACT TGACCATTTG ATTCACGAGC 1800

GATGATTGCA GCCCATGTAG AAGCTGAAAC ACCAGTACGT TGAGCCATGA TTTGAGCTGC 1860

TGATGAACCA GTAGCACCTG CAGTATTACC ATTGCTTAAT CTCACTGAAC TTGAAGTAGT 1920

TGAAGTGCTG TAGTTATGGT AAGTTGGAGC TGAAACAGCT TCAACGTtTG AGTTACTTGA 1980

TTGTGCATTG TAGCTTACTG ATTGTACATT TGAACCTTGG TTGTATGAAG TAGTGTAGTC 2040

TGCACCTGCA ACGTTTGAGA AACCAGCAGT TTGACCATTA GCTGCTTCAT AGCTCCATGA 2100

CCATGTAGTA CCATTTGAAG TGAAGTTATA TTGGAAACCA TCTTTTACAA AGTGGATGTC 2160

ATATGCACCA TCTTTGATTG GAGCTGCATT TAATTGATCT TGGTGATTAT GCGCTAAGTC 2220

AACTAAGTGT GCTTGATCAA CGTTTACTTC AGCAGCGTGT GCTTGATGTC CTGTACCTGC 2280

TGCGTAACCT GTTACACCTA ATGCCACTGC TAATGATGAT GCCATAATTG TCTTTTTCAT 2340

AGTAAAAAAT CCTCCAGTAA TAATTGTnAG TTTATGTTTT TAGTAATTAT AtTTTGaATT 2400

TGAATGTCGT AGTgCAAGTT TAAATTGTCT TTTATTTCTT TCaACGGTAC TCACTATATC 2460

ACAaAAAACC AGCCAGTAAA TTACACTTTC TTTACAAAAC ATTACAATAT CAAGTGTTAT 2520

TTGtAATGTT GAAATATGGC TGTTTTATAC TGTAATGTGA AATATGTGCC CTTTAGAATC 2580

CAATCAACCC TTGAAATAGT CTTTAACACA TAAGATTTTT ACTATATTTA GCTCAACTAT 2640

TACAGCTTTC GTAATATTAC AGATTGTATT TTTGTTACAT AGCTGTAATA TATCTGACAT 2700
TACACATGTA TTGATTGCTA TTATTGTTGT ATATTCAAAG TTTTAAAACA CACATCTTTT

GTGAATTGTC TTATCTTTTA TTAGCGCAAA TAAACTGCAG CTCAATTATA TTGTTCAACT

TCATTCTCGC AATTCACAAT AACATTAAAT AATTTTTGGT CTCATATTTT CAAAAAACAT

ACTGTTATTA TCCCATGAAT TTAAAAATAT CATTAGTATA TAAACGAAAC ACTTTACGAT

AAATGATATC TGCAAGCCAA GCTGTTACAA ATGGTACAAC AAAGAACGCT ACTACAATTA

GTAAGACACT CAACCAAGCA GAATCAACCT CCATAAATTT AAATGCATTA ATCGGTCCTA

CCATTCCTAT AAAACCAAAT CCAGCTGACT CTTTCGTTCC ATGAATACCT ACTAATGCTG

ATACCAAACC TGATACAATG GCTGTCGTTA ATATTGGTAA CATAAGAATT GGATATTTCA

CCATATTAGG TATCATCATT TTAACGCCTC CAAAGAAGAC GGATAACGGC ACCCCTAAAC

GATTCACTTT ACTTGTACCA ATTATCAATA CTGCTTCAGT CGCGGAGATA CCAATTGACG

CTGATCCAGC TGCTAAACCT GTAATACCTA TCGCAAAGGC AATGGCCACA GTTGATAGTG

GCGAAATAAT AATAAGACTA AATACCATTG AAATCAAAAT ACTCATGACA ATCGGTTGTA

ATTCTGTAAA ACCATTAACC ATATTACCGA TGGCTGTTGT AATCATTTTC GTATACGGCA

ATATTAAAAC ACCAATTGCA CCTGAAATAC CGCCAACAAC TGTTGGGAAT ACAATCAATG

CCATACTACC TACGCGATGT TGAATAAGTA AAATGAATAA CACTGCAATC GCTGCTGTAA

TCATTGTATT AATTAAATCA CCAATACCCG TAATCATCCA AGCACCATTT TTAAACTGCG

CTGCACCGCT TCCTACATAT GCTGCACTTG CCACAACAGC AATTGCTAAT GGCGATAGGT

CAAATTTCAT GGCAACCAAT GCACCAATCA AAGCAGGTAC TGTAAATTGA ATTGCAACGA

CAACGCCTAA TAACGTTTTA AAAATCGGAT GATAATCCAT AAAGTATTTA AAAATTTCTC

CAAGTATCGC ATTAGGAACT AAACCCGCAA CAATACCTAT GGCGACACCT GATAAAACTC

TAAATATAAA ATCTTTGGGT GTAATTGTTT TAATTGATGT CATAATATCA TCCTTCCATT

TATGTATATA CATCTGTATG CAAATAATAA AGAGCCTTAA GTTATAAGCT GCCACTAGCT

TAAATTCTAA GATGTGCATG CCGATGTTGT TATATTTAGG CTAGCAGTAT CATCTATAAC

TCAAGACTAT GAAAAATAGT ATATCACAAA ATTCTGAATT TTTAGATAAA TAAATTGGCA

ATTTTTCAAA CATATTGTTA CAATACACTT TTATTTTATC TTCATTTTTA AAATGCATTA

ATACAATAGA AGAAAGACAT TCAAATGCTT ACCAAAAAGG TACATTATTT GTTAGGAGCG

TATCAGCaCT TACATATCAT CAACACAATT GACAATATAA TAGAAGATAC TGATAATAAG

TGTTAAAACA ACAGATGTTA GGTAGTGAAC AAATGATGGA AAGTAAATCC ATAGATCCAA

GAATCGTTAG AACCAAACAA TTGCTTGTCG ATGCTTTTCT TAAAATTTCT AGAGAAAAGA
TTTACGCTCA TTTCGCTGAT AAAGAAGACC TCCTAGACTA CACATTATCT GTAACCATTT 4620

TAAAAGACTT GAATGATAAT TTGAGCATTT CTAATGTCAT TAATGAAAAG GTTCTGCGTA 4680

ATATTTTCAT TTCAATTGCG AGTTATATCA AAGATGCTGC AAAGTCTTGC GAATTAAATA 4740

GTGAAGCATT TTGCAACAAA GCACATCAAC GTATTAATAA TGAATTAGAA GATATTTTTG 4800

CGATTATGTT AGAAAACAGC TATCCGGAGC ATCAACGAGA TATCATTGTA AATAGTGCGA 4860

GTTTTTTAGC AGCTGGTATC TCAGGCTTAG CATTACATTG GTTTAACACG AGTCAAGAGA 4920

CAGCCGATGT GTTTATCGAT CGCAACCTTC CATTTTTAAT TCATCATATA GCACATTTTT 4980

AATAAAACTT GGTATTTAGT CATGCATCTT GAAATCACTA TGTGACTTAG GTTCATACTT 5040

GTACACACAA TAAAATTTAA CGTATTACGA TTGATTAGCC GTGTCTAGGA CATAAATCAA 5100

CGTCCTATAC TCTACAATGT CATATTAGCA GTCGTTAACT GAATGAAAAT AAGCTTGTCA 5160

TTAAAACATA TAGATTTTAG TGACAAGCAT TTTTGTTTTT GCGTACTTAA ACAACACTTC 5220

AGGCAATATG TTGTTTAGGC AACAAATGAT ATGTGCGTGT TTATTGGCAA ACGTACGACA 5280

TAGTAGTATA GTATGTCTAA ACAACATATG TTGCATAGTT GATATGCGTT GTTTAAATAC 5340

TAAGATAGGA GGGATTGACG TGAGCGAGAC AGATGAACCT CAGGGGTTTG AACGCACGCA 5400

TAATATATTA AATATTAATC AGAGTAGTCT GGGTGTAGTG ACATACATTA CAAATAAATT 5460

AAAGTCGACG TTGAAGCAAC ACATAATAAT TGCTCGTGGT AAAAAGCGAA TCGACTATCG 5520

ACTGTCGTAT AACTTTTACA TACGTATTAT GATAATGTAG AAATCAAGAA AATCGACTGT 5580

GAATATACCT ATGCTATGCC CATTGCAATT TTAATAAGAC ACACGATGTC ATTCGACAAT 5640

GCTCATTTCT TTGCTCAGTT ACGTCATCCT GTCTTATAAA ACAACATTGC AGACATGTAT 5700

ATCAAACGAC ACTTCAATAA CATCACTTTG CCcATCGTAC TACTAGTAAA ATCGTGTCTC 5760

AAATCCCTTA TTTTAATTCC AAAAAtCTGC TGGTCAAAAG ACCGAGAAAC TAAAAACATT 5820

ACTTAATGTG TTGATAAATT ACCATATAAA AATAATCTCA AAATATATCA ACACTTGATT 5880

CTAAGGAGGA TATGACAATA TGAAAATTTT AGATAGAATT AATGAACTTG CAAATAAAGA 5940

AAAAGTACAA CCACTTACTG TAGCTGAAAA ACAAGAACAA CATGCATTGC GTCAAGACTA 6000

CTTAAGcATG ATCCGAGGAC AAGTATTAAC AACATTTTCC ACAATAAAAG TGGTTGATCC 6060

AATCGGTcAG GATGTCACAC CAGATAAAGT TTATGATCTT CGCCAACAAT ACGGTTATAT 6120

TCaAAATTAA tATTTGCTCA CGAGGTATTG CACTTAAGGT GCCAACTGAC CTCATAAACA 6180

AAGCCCATAC TGATTGAAGA CACTAATGTG tCsaCCATGG TGCACATTAC GCTTCATCTC 6240

TGTATGGGCT TTTTATTTAT TCTTTTGAGA ATTTCATTTT AGCAGACCAA AAAATTAAAA 6300
TGAACGACTG TGCCACCCGC TTCTTTCACT TTATTCACCA ACTGGTCAAC TTCTTCATTT 6420

GTGTTCACAC CTAGAGAAAT CATCACTTCA TTTGGTTCAG TATTAAGGCT TTGCTGACTT 6480

ACATTTTGAA AATGCTTGTn TTCTATTAAA ATTACGGkTG tTTGACCTAT tTGAATGCCG 6540 

ACCATTTTAT CTAACATTTG TGGGTTTCTA TTTATTTTAA ATCCTAACGC TTTATAAAAC 6600 

TGTGCGCTCT TTTCTAAATC TTGCACATGC AAATTAAACC ACATTGATTG AATCATGATT 6660 

GCACCCCATT CATTACTTAT TATAGTTTTG GACTTTAAGC CAATCACTTA ATGATAATCT 6720 

TGTTGGATTT ATTTCAGCCA TTAATTCAAA GTCTACTTCA TAACCTTTTT CTTCCAACCA 6780 

TTGCTTTTCT GCAACACCAC TAACAAATTC TCCTTCTATA ACAGTAGATT TACCTGTCAC 6840

TTCACTAAAA ATTGTTGCTG CTTCACTTAA TGTAACTTCA TCGGAACCAA TCTCTATTGA 6900 

TTGATGCGTA AAGCTTTGTG GATGTGCAAA AATATACGAT GCAATTTTAG CTATATCAAT 6960 

AGAAGAAATC ATTGTGAATT TTATATTCGG ATTAATAAAT TCTGGTAATG TAATACGTTC 7020

ATCTTCGACT TTAGCAATGC GTAAAAAATT ATCCATAAAG AATGATGGTT TGATAACTGT 7080 

TGCATTTATA TTAGATTCCA TTAATCTATT TTCTATTTTT GCTAGTACTT CAAAGTGTGG 7140 

GCCAGTTCGA TTTCGATTAA CCCCTCCCGC AGTACTATAC ACAATATGTT GAATATTTTC 7200

TTGCTCAGCT ATTTCAATTA TCTTCATACC TTGTCTTAAT TCTTCGCTAA CATCATCTTT 7260 

AACGATTGGC TGAATACTGT ATAAGCCATA CTTACCTTTC ATCGCTGATT GCAAACTAAC 7320 

ATTATCACTC AGATCACCTT CArCGATTGA TAAATGCGGA TGTCCTATGT CTGAAAGTTT 7380 

ACGATTATnC TTATTTCTAG TTAATGCACT TACATACCAT CCATCCTCTA ACAACTGTTT 7440 

TACAACTGCA TTACCTTGCT TCCCTGTTGC GCCTATTACn AAAATATCTT TCAT 7494 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11802 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

AATTTATTTC GCCGTCCCAC CCCAACTTGC ATTGTCTGTA GAAATTGGGA ATCCAATTTC 60 

TCTTTGTTGG GGCCCcGCCC CAACTCGCAT TGCCTGTAGA ATTTCTTTTC GAAATTCTCT 120 

GTGTTGGGGC CCCTGACTAG AATTGAAAAA AGCTTATTAC AAGCGCATTT TCGTTCAGTC 180

AATTACTGCC AATATAACTT CGTAGATCAT AGAACATTGA TTTATTTCCC AGCCTATTCT 240 
AGCAAAGGTA ATAATGATAT TAATAATGTA CAAAAAATAT AAATCAAATC GACATCCTTA 360

TAAAACATCA GAACCACTAA AAACAAAAAA GCACAAAATA AAATTAAATT TAAAATAAAC 420

GACCACTTTT CAAAAAAATC TCtTTTCaTa TTTCCACCCC TAATTTTAAT AAGCATTATT 480

TTATATTCTC TTTTAAGTTT ATTATTCAAA AGGAAAACAG AAATATCTTT CaATATTATT 540

ATAAACATTT CAACTACTTT TAAAAACCAA CAAAAAAATA CTTATTTTAA GTAGATGAGC 600

ATAAGTGAAC ATAGTTCTTT AGTTATAATA ATTAATTCAA CCAAAAGTCG ATTTGTTTTT 660

GCAATTGGTT TTCATTTCCT CTTAAAGATA TTTTCATTAA ATCTGTCAAA TCAATAGACG 720

CTATATTTTT CAACTTATCT CTATATTTAT TTTTAGTACG TCTTTCTAAA TTTCCCCATT 780

CCTCTTCTTC GTGAGTTAAT AAATGAAGCA TTGCTCGTTC TTGTATATTT TCAATCATTT 840

TTAAATTCGG TTTTAAAATA TGCAAATCAT CAAAACAATC TTTCCAACAA TCAACCATAT 900

CTCGTTTTAA TTCAATTTCC ACACGCCATA GAAATGTTGA ATCAATTTCA ACATCTGCAT 960

TATCTTTACG TTCTTGTTTT TATTATAAAT CCGAATAAAC CTATCACTAT TACGCACACC 1020

AAAATATTTT GTTTCTGGTT TTACATTACG TCCATAAAAT ATAGTTTTCT TTACCGACTT 1080

ATCTGACAAT GCATAATAGT CATTTAAATC AAATTCAAAA TCAAAAGCCA AATCTAATCT 1140

CGTAAAACTA ACATCGTCCA AATAACTGAT GATATTTTGT TTTAACCAAA GCACTTCATC 1200

ATGCGAAAGC TTATTAGGAT TAAATTCAAC GCGCATAtAC GTCTATTCCA AAGAGTTGCT 1260

TTTATTTTGT CATATTCAAT ATAAACTTTT TCTTTAAGAG CTTTAGCTTT AAAGTTTGTT 1320

TGTAAAATAT CCCAAAGCCG AATTTCAGGA TTAGTACTCA TAAAATGTGA AAGTCTCTCT 1380

GCGTTAGACA TGCTAAGATT CCCAACAATC GTTATAGCGT CAAAAGACAA TTTTGGAATA 1440

GCTAGTGACA TCCTATGTCG ATTTAACCGG CTATTACCGG ATATTAGAGT ATCCAGTTTT 1500

ACAAATGGAT GAAACGAAAT TCAAAACACT AAAAAATATG TTCCACTAAC AGCAAAAAAA 1560

TACCATTATG TTCCTACTAA AAAACyAAAA ATACTGGAGA ACAAATGTCA GGATATAACT 1620

TAGGATACTA TGTAATAAAA ATTTACAATA AAAAAACAGG AAAACAAATT TCAAGTAAAA 1680

GmATACCCAT ACAAAGAGGA TAAAATAAAA AACCTCGAAC TGaAATGATG ATCTTTTCAG 1740

CTCGAGGTTT AAATATTGGT GCCTTATTTA TATAGATTCG TTATATTATA TTCTCTATTT 1800

TCATTAACmT AATCCTTAAA GAGTTTTAAA TTAATACCTG CTAGATGATT CAAAAATGTT 1860

TCATCAACTT TTAAATAATT CAATAATTTT TGTGGTGTCA GTAAATnTCT ATCAAAATAC 1920

AACTTTAATA AACTATTCAT TTTGACAGGA CGTGACATTT CAATCACGTC GTCTAAAGAT 1980

AATACTTTCT CGCTTTAnAC AAAnACAAAA ACTTACCCGA TTAAAATCAA GTAAGTTTTA 2040
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10 






TATTTGATAA AAAATCAATA AGTAATTGTG CGCCTTCAAC TTGAATATCT TTTACAACTG 2160 

GCGCGTCGAT ATACATATCA TACTGACCAC CGCCTACTGC ACGATAATTA TTTACACAAA 2220 

TTGTATATGT CTGCTTTAAA TCAACTGCGT GACCTTGAAT CATCATATTG CTCACACGTT 2280 

GTCCCTTTGG TCTTCCAACA TGAATGGTAT AACTTACGCC ACCATATATA TCATAATTAA 2340 

AGTGTTGTGG TTTGGGTTCA AGGAAGTCTG CGCTCACACT AACTTCATCA TTTTTCACGT 2400

CAAAATATTC TGCTGATCGT TCAATGGCTT CTTTAAGTTT GGCACCACTT ACAGCTAAAA 2460 

CTTTAAATGT ATTTGGAAAT GGGTAATTGT TAATAACATC TCGCATCGTC ACGACTTGCT 2520 

TGAAACCACT AGCAGAATCA AACAAAGCTG TACAGGCAAC ATCTGCGTCA CTTTTTTCTA 2580

ATAAAGCGTA ATTCATAAAA TTTGTAAAAG GATGCGGTGC CACACGTGCC TCAAATGCAT 2640 

GATTAATCGT CATATCATAT GGCAATGTAG TAATTTCGTA ATCTAACCAG TCCTCTAACT 2700 

GCTTTCGTAA ATGTTGGTCA TCTTCATCAA TAGTAAATGT GGAATCATCT ATAACAGGAA 2760 

GTAATTCACA TGATTCAACG GATAGATTTT CATATTCATC AGTACTCAAG ACTACTCTGC 2820

CTACAGTTGT ACCTCTCGTA CCAGGTTGAA TCACAGCCGT TTGCTTAAAC CTTTCAGCAA 2880 

TTTGTCGATG TTGGTGACCC GTAATAAAGA TATCTATATC TTTAGAAAAC GCTTCTAACA 2940

TGGCATATCC TTCATTTTCA CCCGTTAATA CTTCGGTCGG CGTACCACTT TCTAAATCCT 3000 

TTTCAAATCC ACCATGGTAA CAAACCACAA TGATATCTGC ATGTCGCTTC ATTTCAGGTA 3060

AGTATTGTTG AAGTATTTCA AAAGCACTAT GAAACGTArT GnCnTGAATA TGCTCTGGTT 3120 

GTTCCCAATG GGGAATAAAT TGTGTCGTTA AACCTATCAC ACCAACAGTT TGATCTCCAA 3180 

CCTGAAAATA CTTCACACCG TTATCAGTCA ATGTACTATC ATTTTCATAT ATATTAGCGC 3240

ACAAAACTGG ATAATTGAGT CTGCGTAAAG TGTCTTTTAA GTATGGTAAT CCATAATTAA 3300 

ATTCATGATT ACCAAGCGTA CCAAAGTCGA ATGCCATTCG ATTATAAAAA TCAACTAAAG 3360 

GCTGGCTACT GCCGCTATGC GCGATTAAGT AATTACAAAA TGGTGACCCT TGCAAAAAAT 3420 

CACCATTATC TATTTTAAAA CTTTGGTCAT ACTGCCTTCT GTsTTGTTCT ATAACATGAT 3480

TCGCTAGTAA CAATCCCATA GGTTGATATT GATTTCTACT CGTAAAATCT GTTGGGAAAA 3540 

TATAACCATG TACGTCACTC ACGACATAAA ATGCTATGTT TGACATCCTC ACTCACTCCT 3600 

TCAATCACAA ACATCTTTCT TATTTCTATT ATATATTTAT TTGAAGTCTG TTGTAATCAA 3660 

GGTTTTGTCA CCGAGTTTTA AACGAATCTT TGAACCTTCC ATACTTTCAA GTACTTTAGC 3720

ATTGACCTTA ATTGTGACAT TTCCGTTTTC ATCTGCTTTA ACTGTTGGCA AAGTACTGTA 3780 

ACCTGGTGGG TTATAATCGT TATCTTTACT TGAAAATTGT CCGATTTGAC GTCCGCCTTC 3840 
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TATTGTCATT TCAAATGGCT CATTTACAGA AACATTTTGC GGGATATCAA 


ATGTTACTTT 3960

TTCGTTCTGA TTTGGTGGTG TATGATCATC TGGTGTGTTT GGCTGAGGAT CTGCGCCTTT 4020

TTCGCTGCCA TAACTACCTG CTTTAAATGT TGTTGGATCA TACCATTTAT AACCACTCGG 4080

CGGTTGTGAC CATGGCTCTT TTTCAGGCTC AGTTGAACGC TCTGGTCGTT CAAAATCAAG 4140

CAACTTAGTC TTTGTATCTA ATGTTAGGCT ACTCGCCTTA AGTGATTTCC CATCATTATC 4200

TTTAGACATC CAAGCCGTTA TATTATTTAA TAGCTTACCG TTGTCTTGTT CTTTAAAACC 4260

ATCATATGTT TTCTTCTTTT CTCCATTATC TTCTCTTACA TATTTGGGCG AACTATCTTC 4320

CACAAGTGAT GAATCACCGA TAAATGCTGC TTTACCTTTT CCAACTTTAG AAATTGCTAC 4380

ATAGGGGCCT TCTGCTTTAC CGCCCCCATT ATAAATACCT TGATCTACAG CATGTGACCA 4440

TTTACTTTTC GCTGGCAATT GTTCTGGTGT ATACACAATA CCTTTTGCTT TCTCTGGATT 4500

AGTAATTGCT AATGTCGATC CGGCATGCAT AGAGACAGAT TTCACACCTT CAGTAATACC 4560

GAAACTTTCT TTTGAAGAAA CAATATTGCT CGTATTTAAA TCACCTAGTG CATTATATCG 4620

AAAACGTACG CCAAAGTTTG TAGATAACCA ATCTGAACTT TTCACACCTT GCATTGCAGT 4680

AGAACTTTTT TCTTCTGCAT TCATACCTTT CGACATATCT TCATATGCTC CACGTCGATA 4740

ACCATTCATT GCCTCCGATG AATCAATACG ATTTAAATTT CGGTCAGCAT TGTAATGATC 4800

TGAAATAAAG ACAACATTGC CACCTTGTTt CACATATTTA ACAATTGCTG CCTGTTCTGA 4860

TTCTTTGAAA GGAATGTTAG CCTCAGGAAT TACAAATATT TTGGAACTTT TCAAACTTGC 4920

TTCTGTTATG TTCGAATGAC CATCAATAGC TTTAACGTCA TAACCTTGTT TTTGTATTGA 4980

ATCCGCATAA TCTGAAAATG CACCATCACT AACCCAATCT GCAGCACCAG CTGTTTGACC 5040

ATGAGAACGA TCGAATAATA CCGTTCGCTG TTGCTTTGTA GGTTGCGATT CATGCGTTAT 5100

AGCHVAAGAT TGCGGTAAAG CACTTAATGA TACCGTTGCA ACAATTGCAG AGACAGTTAA 5160

TGACTTATAT ATTTTTTTCA TTTTGTGAGG CTCCTTTTAA AATAAATTTG TTCTTGAATT 5220

ATAGGATAAA AATTCGTTGC ATATGAGCAA TTTAACGAAA AATTTACAAA ATCTTATCAA 5280

ACTCTTAAAG AAAGTTATTA AAATTCATTT TTATAAAATA CTTTTTAACA TTTAAATGTG 5340

GTACGCTATA AGTGTAATTT CATTGCATAC ATATTACACG ATTAAGAATG TGAAGGGGAC 5400

AGTTATCAAA TGAAAAATTT TAAGTGTTTA TTTGTATTAA TGTTAGCAGT CATTGTTTTT 5460

GCAGCAGCAT GTGGAAACTC AAGTTCTTTA GATAATCAAA AGAACGCTAG TAATGATTCG 5520

GATTCTAAAT CAGGAGGATA CAAACCTAAA GAATTAACCG TTCAATTTGT ACCTTCGCAA 5580

AATGCTGGAA CATTAGAAGC TAAAGCAAAA CCATTAGAAA AATTACTATC TAAAGAATTA 5640
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TCTAAAAAAG TTGATGTTGG TTTCTTACCA CCAACGGCAT ACACATTAGC ACATGATCAA 5760 

AAAGCAGCTG ATTTATTATT ACAAGCACAA CGTTTCGGTG TAAAAGAAGA TGGTTCAGCA 5820 


AGTAAAGAAC TTGTAGATAG TTATAAATCA GAAATTCTTG TTAAAAAAGA CTCAAAAATT 5880 

AAAAGCTTGA AAGATTTAAA AGGTAAGAAA ATTGCCTTAC AAGATGTAAC ATCAACTGCT 5940 

GGATATACAT TCCCACTTGC GATGTTAAAA AACGAAGCAG GTATTAATGC AACTAAAGAT 6000 


ATGAAAATTG TGAATGTTAA AGGTCATGAC CAAGCAGTTA TCTCATTATT AAATGGAGAt 6060 

GTAGATGCTG CGGCTGTATT TAACGATGCA CGTAATACTG TGAAAAAAGA CCAACCAAAT 6120 

GTATTTAAAG ACACACGAAT TTTAAAATTA ACACAAGCTA TTCCGAATGA CACAATTTCT 6180

GTAAGACCAG ATATGGATAA AGATTTTCAA GAAAAATTGA AAAAAGCTTT TATAGACATT 6240 

GCTAAATCAA AAGAAGGTCA CAAAATTATT AGCGAAGTTT ATTCACATGA AGGATACACA 6300


GAAACGAAAG ATTCAAATTT CGACATTGTA AGAGAGTACG AAAAATTAGT TAAAGATATG 6360 

AAATAATCAT TATTTAACAA ATGAATCATT AGCGAATTTG GTATTAAAAG CTTTCGTTCA 6420 

ATAGATATAT TCTAGATTAA TATTGAAAAG CTAGGCGCTA AACTGAAACA GATATAGAAA 6480 


GGTGTCGCTG TACATTTGAA ACCATTTGTA CACAGAAACC CAATGTCTAT GATATTTCAG 6540 

TTTACCTTGG CTTTTCTTTA TTAAAGAAAG GTGTCAAACA TGAGTCAAAT CGAATTTAAA 6600 

AACGTCAGTA AAGTCTATCC TAACGGTCAT GTAGGCTTGA AAAATATTAA CTTAAATATT 6660

GAAAAAGGTG AATTTGCAGT TATTGTCGGA CTATCTGGTG CTGGGAAATC CACGTTATTA 6720 

AGATCTGTAA ATCGTTTGCA TGATATCACG TCAGGTGAAA TTTTCATCCA AGGTAAATCA 6780 

ATCACTAAAG CCCATGGTAA AGCATTATTA GAAATGCGCC GAAATATAGG TATGATTTTC 6840

CAACATTTTA ATTTAGTTAA ACGGTCAAGT GTATTACGAA ATGTACTAAG TGGACGTGTA 6900 

GGTTATCACC CTACTTGGAA AATGGTATTA GGTTTATTCC CAAAAGAAGA CAAAATTAAG 6960 


GCAATGGATG CACTAGAACG CGTCAATATC TTAGATAAAT ATAATCAACG CTCTGATGAA 7020 

TTATCAGGTG GCCAACAACA ACGTATATCT ATTGCACGTG CGCTATGCCA AGAATCTGAA 7080 

ATTATTCTTG CAGATGAACC AGTTGCTTCA TTAGACCCAT TAACTACGAA ACAGGTTATG 7140 


GATGATTTAA GAAAAATCAA CCAAGAATTA GGCATCACAA TTTTAATTAA TTTACATTTT 7200 

GTTGACTTGG CAAAAGAATA TGGCACACGC ATCATTGGTT TACGTGATGG TGAAGTTGTC 7260 

TATGATGGTC CTGCATCTGA AGCAACAGAT GACGTATTTA GTGAAATATA TGGACGTACA 7320

ATTAAAGAAG ATGAAAAGCT AGGAGTGAAC TAACATGCCT TTAGAAATAC CTACAAAGTA 7380 

TGACTCCCTT TTAAAGAAAA AGGTTTCTTT AAAAACGAGT TTTACCTTCA TGTTAATCAT 7440 
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15 



AATACCTCAA ATAGGTGATC TATTCAAACA AATGATTCCA CCTGATTTCG AGTATTTACA 7560 

ACAAATTACA ACGCCAATGT TAGATACCAT TCGAATGGcT ATCGTAAGTA CAGTATTAGG 7620 

TAGCATCGTT TCAATACCAA TTGCGTTATT ATGTGCTAGC AATATCGTTC ATCAAAAGTG 7680

GATTTCAATA CCCTCGCGCT TTATTTTAAA TATAGTTCGT ACTATTCCAG ATTTGTTATT 7740 

AGCAGCAATC TTTGTGGCTG TATTTGGAAT CGGTCAAATT CCAGGGATAT TAGCACTGTT 7800 

TATTTTAACT ATCTGTATTA TTGGAAAATT ATTATATGAA TCATTGGAAA CGATAGATCC 7860 

AGGTCCAATG GAAGCAATGA CGGCTGTTGG CGCTAATAAA ATAAAATGGA TTGTTTTCGG 7920 

TGTTGTACCA CAAGCCATAT CGTCATTTAT GTCATACGTA TTATATGCAT TTGAAGTAAA 7980 

TATACGTGCT TCAGCTGTGC TTGGATTAGT CGGCGCTGGC GGTATTGGAT TGTTTTATGA 8040

TCAAACACTT GGTTTATTTC AATATCCAAA AACAGCAACG ATTATTTTAT TTACTTTAGT 8100 

TATCGTCGTC GTCATTGATT ACATCAGTAC GAAAGTGAGG GCACATCTCG CATGACACAG 8160

GAAATAGCAA AATATAATGT TCACACAAAA GCACACAAAC GAAAATTGAT TAAAAGATGG 8220 

CTTATTGCAA TTGTCGTCTT AGCTATTATC ATCTGGGCAT TTGCAGGTGT ACCAAGTTTA 8280 

GAACTTAAAA GTAAATCATT AGAAATCTTA AAATCCATAT TCAGCGGATT ATTCCATCCT 8340 

GATATCAGCT ATATCTATAT ACCAGATGGC GAAGACTTAT TACGTGGTTT ACTTGAAACC 8400 

TTTGCGATAG CCGTTGTAGG TACTTTCATC GCCGCAATTA TCTGTATTCC ATTAGCATTT 8460 

CTAGGTGCAA ATAATATGGT AAAGCTACGC CCAGTTTCAG GTGTTAGCAA ATTTATTTTA 8520 

AGTGTTATAC GTGTCTTCCC AGAAATTGTA ATGGCACTTA TATTTATCAA AGCTGTTGGC 8580 

CCAGGTTCAT TTTCAGGTGT ATTAGCTTTA GGTATCCATT CCGTAGtATG CTTGGGAAAC 8640

TTTTAGCTGA AGATATTGAA GGTCTAGATT TCAGTGCTGT AGAATCATTA AAGGCCAGTG 8700 

GTGCGAATAA GATTAAAACA CTCGTATTTG CAGTCATACC ACAAATTATG CCTGCCTTTC 8760 

TATCACTCAT ACITTATCGC TTTGAACTAA ACTTACGTTC AGCTTCTATA CTGGGGCTAA 8820

TTGGGGCTGG TGGTATCGGG ACACCACTCA TATTTGCCAT TCAAACACGT TCTTGGGACC 8880 

GTGTAGGTAT TATATTAATC GGTTTAGTAC TAATGGTCGC AATTGTCGAT TTAATTTCCG 8940 

GTTCAATCCG AAAACGTATT GTTTAACATT AAATCAGGAT ACTCCTAAAT AAGAAGTCCT 9000 

ACCGTCTTAC GTTTCTCTAT TATAATAAAA ACAGCAGTGA AGAAAACTAT TGTTATAGTT 9060 

AACTTCACTG CTGTTTTTAT AATATCTAAA TTTATTCTAT TTCAATTCCT TTAAATAACT 9120 

TTTACCGAAC TCTGGTAATG TTACGTTGAA ATTATCTGCT ATAGTTGCAC CGATAGAACT 9180 

GAATGTAGTA TCACTTTCTA GTGCATGACC ACCTTTAAAT TTCGGACTGT ACATAATTAC 9240 



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TGTAATAATT ACTAAATCGT CTTCTTTTAA 
GTTGCTAAAC AGTTCTGGCA AGCGATCATC 9360

GAAATCTTTA ATTGCTTGTG CATAACCTGG TTTATCACGA CGATGACCGT ATAATGCATC 9420

AAAGTCTACT AAGTTTAAGA AGCTAATACC TGTGaAATCT TTCTTAACAA TTTTCATCAA 9480

TTGATCCATA CCGTCCATGT TACTCTTCGT ACGAACCGCT TCTGTTACAC CTTCACCATC 9540

ATAAATGTCA TTAATTTTAC CGATGGCAAT AACATCATAA CCACCGTCTT TCAAATGATC 9600

TAAGACAGTT TTACCAAAAG GTTTTAACGC ATAGTCATGT CGATTAGATG TACGTGTAAA 9660

GTTTCCTGGT TCACCAACAT ATGGACGTGC GATAATACGA CCAATTAAAT ATTTAGGGTC 9720

TTTTGTCAAC TCACGAACCT TTTCACAAAT ATCATATAAC TCTTCTAATG GGATAATGTC 9780

TTCATGTGCA GCAATTTGCA ATACTGGGTC TGCAGTTGTA TAAACAATTA AGTCACCAGT 9840

TTTCATTTGG TGCTCGCCCC ACTCATCGAT AATTTGCGTA CCCGATGCCG GTTTGTTAGC 9900

AACAACTTTA CGACCTGTCA TTTCTTCAAT TTGTTGAATT AACTCTTCAG GGAATCCATT 9960

AGGGTATACT TTAAAAGGTT GCATAATATT TAATCCCATA ATTTCCCAGT GACCAGTCAT 10020

TGTATCTTTA CCAACTGAAG CTTCACTCAA TTTAGTATAG TATGCTTCTG GTTGTTCAAC 10080

TGCATTTACT ACTGGTAATT TATCGATGTT CCCTAGACCT AACTTTTCAA GGTTTGGTAA 10140

AGTTTGATCG AAACCTTCTA AGGTATGTCT TAAAGTATGT GAACCTTCAT CTTTAAAATC 10200

AGCTGCGTCT GGCGCTTCAC CAATACCTAC TGAATCCATT ACGATTAAAT GTACACGATT 10260

AAATGGTCTT GTCATAGCTA TCACTCCCAA AATTTATATA TATTAGTAAT CTGAATCTGC 10320

TTCTAAACCT TGCATAATTT GAACACCTGC GCTCGCACCA ATACGTGTCG CACCTGCTTC 10380

AACCATTTTA TTGAAATCTT CTAAATTACG TACGCCACCT GATGCTTTTA CTTCTACATC 10440

AGCACCTACT GTATCTTTCA TTAATTTAAC GTCTTCTGCA GTCGCACCGC CACCTGCAAA 10500

ACCTGTTGAA GTTTTAACGA AGTCCGCACC AGCCGCTTTT GTTAATTCAC TCGCTTTTAC 10560

AATTTCGTCA TGGTCCAACA ATACCGTCTC AATAATCACT TTTACTGTGT GACCTTTCGC 10620

AGCTTTAACC ACTGCTTCAA TGTCTTGTTG TACATCATCA AAACGTCCAT CTTTTAATGC 10680

GCCGATGTTG ATGACCATGT CAATTTCATC TGCACCATTT TGAATTGCAT CTTCTGTTTC 10740

AAATGCTTTC GTTGCAGTTG TCGACGCACC TAATGGGAAT CCTATTACCG TACAAACGAG 10800

CACCTCTGAA TCAGCTAGTC GCTCTGCTGC ATATTTAACA TGTGTTGGAT TCACACATAC 10860

AGATTTAAAA TTGTATGctT TCGCTTCATC GATGATTTGA TCGATTTGCG TACGTGTTGA 10920

CTCAGGCTTC AATAAAGTGT GATCTATATA TTTCTCAAAT TTCATACTTA CTACTCCTCG 10980

TGTTATATAA TCTCTTTATT TAATTTTACT ATAAATACGA ATATATCTCG CGAATTTATA 11040
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ATACTCATTA AACCTAAAAT AATTAAAATA ATACCGAAAT GTGAACTTAA TGCATCATTG 11160 

CCTGGGAAAT TTAATGCTTT AAAATCGATT AGAGCCGCAG CAATCGCAAT ACCTACAGAT 11220 

ACCGCCACAT TAATAATTAA ATTATAAAAA CCAATAGCCA CACCTGTCAT ATTAAGATCT 11280 

ATTGTTTTAA TGGCTTCGTT AAGTAAAGGT GCATACATTA AAGCAAAGCT ACCTGCAAAG 11340 

AATATCATAG AAATGACGAA GATTGAAATG TGATTACCTA CTGCAAATGC AGGTAAAATC 11400 

AAGCTCAGTG CTATTAAAAT AATTGCTGTG ATAATCGCTT GTTTTGAATT CAGATATTCG 11460 

CCGATTTTAC CACTTAGTGC ACCAACAATG ACTGCTACTA TATAACCCGG TACTAATAAC 11520 

AGTGATGTTG TGTCTAGTTG CAGATGATAA ATTTGCTCCA TTATGAATGG GAACGTAAAA 11580 

ATATAACCCA ATTGGATAGC ATACATTACA AATACTATAA ATAAAAATGA AGCATAACGT 11640 

TTATTTTGGA AAAATGATTT ATTTACTAAT GGACGTTGCG CATTTTTAAT ATATAGCGCA 11700 

AAAACGATAA TCGCAATTAA GGCACCAATC ATATATAACC AATTAAAGTT CGTAATAAAC 11760 

AGCATGACTG TTGTAGCAGG GGATCCTCTA GAGTCGAnCC TG 11802 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1196 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

CTAAAGAAGA TGCGAAACAA GATGTTGATA AACAAGTTCA AGCTTTAATT GACGAAATCG 60

ATCAAAATCC AAATCTAACA GATAAGGAAA AACAAGCACT TAAAGATCGT ATTAATCAAA 120 

TACTTCAACA AGGTCATAAC GACATTAACA ATGCGATGAC AAAAGAAGCA ATTGAACAAG 180

CAAAAGAACG TTTAGCGCAA gCATTGCAAG ACATCAAAGA TTTAGTGAAA GCTAAAGAAG 240

ATGCGAAAAA TGATATTGAT AAACGTGTAC AAGCTTTAAT TGACGAAATC GATCAAAATC 300 

CAAATCTAAC AGATAAGGAA AAACAAGCAC TTAAAGATCG AATTAATCAA ATACTTCAAC 360 


AAGGTCATAA CGACATTAAC AATGCGCTGA CTAAAGAAGA AATTGAGCAG GCAAAAGCAC 420 

AACTTGCACA AGCATTGCAA GACATCAAAG ATTTAGTGAA AGCTAAAGAA GATGCGAAAA 480 

ATGCAATAAA AGCCTTAGCT AATGCGAAgc GTGATCAAAT CAATTCAAAT CCAGATTTAA 540

CACCTGAGCA AAAAGCAAAA GCGCTCAAAG AAATTGACGA AGCTGAAAAA CGAGCACTAC 600 

AAAACGTTGA GAATGCTCAA ACTATAGATC AATTAAATCG AGGATTAAAC TTAGGTTTAG 660 
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TTGAAGCAAC ACCTGAGCAA ATCCTAGTTA ATGGTGAACT CATTGTACAT CGTGATGACA 
TCATTACAGA ACAAGATATT CTTGCACACA TAAACTTAAT TGATCAGCTT TCAGCAGAAG 840
TCATCGATAC ACCATCAACT GCAACGATTT CTGATAGCTT AACAGCAAAA GTTGAAGTTA 900
CATTGCTTGA TGGATCAAAA GTGATTGTTA ATGTTCCTGT AAAAGTTGTA GAAAAAGAAT 960
TGTCAGTAGT CAAACAACAG GCAATTGAaT CAATCGAAAA TGCGGCACAA CAAAAGATTA 1020
ATGAAATCAA TAATAGTGTG ACATTAACAC TGGAACAAAA AGAAGCTGCA ATTGCGnAAG 1080
TTAATAAGCT TAAACAACAA GCAATTGGAT CATGTTnAAC AATGGCACCT GGATGTTCCA 1140
TTCAGTTGAA GGAAATTTCA ACAACAAGGA ACAAGCGCCn GATTGGAACA ATTTGA 1196
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1519 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:



CAATCGTTTC 


AACGCTATTA 


TCTTTAGACA 


ACAATTGTAA 


GCGTGTATGT 


GCAGTTTCTA 


AACAGTCTAT 


AATTCGAGTT 


CTTAATTCAG 


CTGGATCATC 


TTTAAAAATA 


AAATCCATCG 


CTGCAACTTT 


GTAGACAAAT 


GTTAAATAGG 


TAAGTTCACT 


GTGACTCGTA 


ACGAAAATAA 


TGTTACCAAC 


TGGGTCATGC 


TTACGAATTT 


CACTGCCTAA 


TTTGATACCA 


TTAATATCAG 


TTGAAAGTTG 


AATATCTAAA 


AAGTAACAGC 


CTATGTCATT 


CATATTTTTA 


GCTTGCTCAA 


GCACCTCATA 


AGGATTATCA 


GTTGCGAGGG 


CAATTTCCAT 


AGGCTTTTCT 


TCTATCATTA 


TATAATTTTT 


AATAATGGTA 


ACCATGTTTT 


CTCTTTGTTT 


TGGATCGTCT 


TCGCAAATGA 


AAATTTTCAT 


ACATTCACAT 


CCTTATGGCT 


AGTTGTTAAT 


AATTTCAACT 


TTTTGAATAA 


AGAAACCATT 


TTCGATAATT 


GTATCTAATA 


AGACATTGTC 


TGCATTATCA 


GCAATTTCTT 


TTAAAGTTGA 


TAGACCTAAA 


CCACGACCTT 


CACCTTTAGT 


AGAAAAACTT 


TCTTGGAACA 


ATTCATGAAT 


GCGTGGTATA 


TCATCAGCGC 


ATTTATTCAT 


AACAATAAAC 


GTTACTGAAT 


TTTCACTTTC 


AATAAATGCA 


ACGCGAATGA 


TAGGGTCATC 


AATTTCAGTT 


GATGCCTCAA 


TTGCATTATC 


AAGAATAATA 


CCAATACTGC 


GACTTAAATC 


GATCATATTC 


AAGTTAATGC 


TACTTACTTC 


ATCGGGTATT 


TCGATACTAA 


TCGGAATATT 


CATTTCTTGT 


GCACGTAAAA 


TTTTCGCAGT 


AATTAAGCCT 


TTAATTTCAC 


GTACTTTAAG 


ATTCTCGATA 


CCATTTAATT 
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GTAGGCCAGG CATGTCATCT TCTCGAATGT ATTCTGAAAG TGTCGTTAAG ATATTGACAT 1020 

AATCATGACG GAACTTGCGC ATTTCGTTGT TGATAGCTTC AATCTTCAAT GTATATTCAT 1080 

AATAGGTTTC AATTTCTTCT TGATTACGTT TATATTTCAT CTCTTTAAGG AGAAATTGAG 1140 

AAATAACAAA TGTTAATATA CTTAAAAATA TAGTGATACC AATAAAAATA AAAGAATACT 1200 

GCCTTATTAC TTTAGCTTCA TCCGAGTTTA TTTGTGAATA AAAGAAAAAT AATGAAAAAG 1260 

TAAGCAGTAA GATAGTCGAA ATAACTATTA AAAATCCTTT GTTTAGTATT AGATATGGTG 1320 

TGCTAATTTT TTTGAGAACT CTATTTATTA TATATGAGAA TAGTATACTA ATAGTCACAT 1380 

AAACTACAAA AAAGCTAGGG AATATTACAA ATATACTATC AGAAATTTTG GTGGATATAT 1440

GCATATATAA CTATATACCT GTAGTTAGCA CnGTnATAGG AATAATCnGG CGAGGTCCAT 1500

AATCCACCAA AATAGAATA 1519
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5445 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

GTAGGAATCT CTTTGTCTTT TTGGGAGGAC ATTTAATATG AATGTATATT TAGCAGAATT 60 

CCTAGGAACT GCAATCTTAA TCCTTTTTGG TGGTGGCGTT TGTGCCAATG TCAATTTAAA 120 

GAGAAGTGCT GCGAATGGTG CTGATTGGAT TGTCATCACA GCTGGATGGG GATTAGCGGT 180 

TACAATGGGT GTGTTTGCTG TCGGTCAATT CTCAGGTGCA CATTTAAACC CAGCGGTGTC 240

TTTAGCTCTT GCATTAGACG GAAGTTTTGA TTGGTCATTA GTTCCTGGTT ATATTGTTGC 300 

TCAAATGTTA GGTGCAATTG TCGGAGCAAC AATTGTATGG TTAATGTACT TGCCACATTG 360 

GAAAGCGACA GAAGAAGCTG GCGCGAAATT AGGTGTTTTC TCTACAGCAC CGGCTATTAA 420 

GAATTACTTT GCCAACTTTT TAAGTGAGAT TATCGGAACA ATGGCATTAA CTTTAGGTAT 480

TTTATTTATC GGTGTAAACA AAATTGCCGA TGGTTTAAAT CCTTTAATTG TCGGAGCATT 540

AATTGTTGCA ATCGGATTAA GTTTAGGCGG TGCTACTGGT TATGCAATCA ACCCAGCACG 600 

TGATTTAGGT CCGAGAATTG CACATGCGAT TTTACCAATA GCTGGTAAAG GTGGTTCAAA 660 

TTGGTCATAT GCAATCGTTC CTATCTTAGG ACCAATTGCC GGTGGTTTAT TAGGTGCAGT 720

GGTATACGCT GTATTTTATA AACATACATT TAATATTGGT TGTGCAATTG CrATTGTTGT 780 
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CGAATCAATT TACTAAAATA AAAAGAAACG TAAATAGCAT AATTTAACAT GTTTGATTCA 900 

TGGATTATGC TATTTTTTCG CCAAAATTTA ACAGATTTTG TACAATGGGT TAGCGATTAT 960 

TTTTTAATAA AGGAGATACT ACTAATGGAA AAATATATTT TATCTATAGA CCAAGGAACA 1020 

ACAAGCTCAA GAGCGATTTT ATTCAATCAA AAAGGGGAAA TTGCAGGGGT AGCACAACGT 1080 

GAGTTTAAGC AATATTTTCC ACAATCAGGT TGGGTTGAAC ATGATGCAAA TGAAATTTGG 1140 

ACATCTGTGT TAGCTGTAAT GACGGAAGTA ATTAATGAAA ATGATGTTAG AGCTGATCAA 1200 

ATTGCAGGTA TCGGTATTAC AAACCAACGT GAAACAACGG TTGTTTGGGA CAAaCATACT 1260 

GGCCGCCCAA TTTATCACGC AATTGTTTGG CAATCACGTC AAACACAATC AATTTGTTCA 1320 

GAATTAAAAC AACAAGGATA TGAACAAACA TTTAGAGATA AGACAGGATT ACTTTTAGAT 1380 

CCGTATTTTG CAGGTACAAA AGTTAAATGG ATTCTAGACA ATGTTGAAGG TGCACGAGAA 1440

AAAGCAGAAA ATGGCGATCT ATTATTTGGA ACGATTGATA CTTGGTTAGT ATGGAAATTA 1500

TCaGGaAAAg CtGCGCATAT TACTGATTAT TCaAATGCGA GTCGTACATT AATGTTTAAT 1560 

ATCCATGATT TAGAATGGGA CGATGAGTTA TTAGAACTAt TACAGTACCT AAAAATATGT 1620 

TGCCAGAAGT TAAAGCTTCG AGTGAAGTAT ATGGTAAGAC AATTGATTAC CACTTCTATG 1680

GTCAAGAAGT ACCAATCGCT GGAGTAGCTG GTGATCAACA AGCAGCATTA TTTGGACAAG 1740 

CTTGCTTCGA ACGTGGTGAC GTGAAAAACA CATATGGAAC TGGTGGCTTC ATGTTAATGA 1800 

ATACAGGTGA CAAAGCGGTT AAATCTGAAA GTGGTTTATT AACAACAATT GCTTATGGTA 1860 

TTGATGGAAA AGTAAATTAT GCGCTTGAAG GTTCCATCTT TGTTTCGGGT TCAGCAATCC 1920 

AATGGTTACG TGATGGATTA AGAATGATTA ATTCAGCACC ACAATCAGAA AGTTATGCGA 1980 

CACGAGTTGA CTCTACTGAG GGTGTTTATG TTGTTCCAGC TTTTGTAGGT TTAGGAACAC 2040

CATAffTGGGA TTCTGAAGCA CGTGGTGCGA TTTTCGGTTT AACACGTGGA ACTGAAAAAG 2100 

AGCACTTTAT CCGTGCAACT TTAGAATCAC TATGTTACCA AACTCGTGAC GTTATGGAAG 2160 

CAATGTCAAA AGACTCTGGT ATTGATGTCC AAAGTTTACG TGTCGATGGT GGTGCAGTTA 2220 

AAAATAACTT TATTATGCAG TTCCAAGCAG ACATTGTTAA TACTTCTGTT GAAAGACCTG 2280 

AAATTCAAGA AACTACAGCT TTAGGTGCTG CATTTTTGGC AGGTTTAGCA GTTGGATTCT 2340

GGGAGAGTAA AGATGATATC GCTAAAAACT GGAAATTAGA AGAAAAATTC GATCCGAAAA 2400

TGGATGAAGG CGAAAGAGAA AAATTATATA GAGGTTGGAA AAAAGCTGTT GAAGCAACAC 2460

AAGTTTTTAA AACAGAATAA ACTTGTAGAT TAGACTTTTG TATAAACATT GTGATACAAT 2520 

CAATTTAAGT TAATATTTGA ATCGAGAAGC GAGAGATTTG TTCGAACATG TACAATTGAA 2580 
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GCATTGTCTA CTTTTAAGAG AGAACATATT 
AAAAAGAATT TAAGAAATGA TGAATATGAT 2700

TTAGTAATTA TTGGTGGCGG TATTACAGGT GCAGGTATTG CACTAGACGC GAGTGAAAGA 2760

GGAATGAAAG TTGCATTAGT TGAAATGCAA GACTTTGCAC AAGGAACAAG CTCAAGATCT 2820

ACAAAATTAG TCCATGGTGG TTTGCGTTAC TTAAAACAAT TCCAAATTGG AGTAGTTGCC 2880

GAAACTGGTA AAGAACGTGC GATTGTTTAT GAAAATGGGC CTCATGTTAC GACTCCAGAG 2940

TGGATGCTTT TACCAATGCA TAAAGGTGGA ACATTTGGTA AATTCTCAAC ATCAATTGGT 3000

TTAGGAATGT ATGATCGTTT AGCAGGTGTT AAGAAGTCTG AACGTAAAAA AATGTTATCT 3060

AAAAAAGAAA CTTTAGCTAA AGAACCATTA GTTAAAAAAG AAGGTCTAAA AGGCGGCGGT 3120

TACTATGTTG AATATCGTAC TGACGATGCG CGTTTAACTA TTGAAGTTAT GAAGCGTGCT 3180

GCTGAAAAAG GCGCAGAAAT TATCAACTAT ACTAAATCTG AACACTTCAC TTATGATAAA 3240

AATCAACAAG TAAATGGTGT TAAAGTTATA GATAAATTAA CTAATGAAAA TTATACAATT 3300

AAGGCTAAAA AAGTGGTTAA TGCAGCAGGT CCATGGGTTG ATGATGTTAG AAGTGGTGAT 3360

TATGCACGCA ATAATAAAAA ATTACGTTTA ACTAAAGGTG TACATGTTGT TATTGATCAA 3420

TCAAAATTCC CATTAGGTCA AGCAGTATAC TTTGATACTG AAAAAGATGG AAGAATGATT 3480

TTTGCAATTC CACGTGAAGG AAAAGCGTAT GTAGGTACTA CAGATACATT CTATGACAAT 3540

ATCAAATCTT CACCATTAAC TACACAAGAA GACAGAGACT ATTTAATCGA TGCGATTAAT 3600

TACATGTTCC CTAGTGTTAA TGTTACAGAT GAAGATATTG AATCAACATG GGCAGGAATT 3660

AGACCATTAA TTTACGAAGA AGGCAAAGAC CCTTCTGAAA TCTCTCGTAA GGATGAAATT 3720

TGGGAAGGTA AATCAGGTTT ATTAACTATT GCAGGTGGTA AATTAACAGG CTATCGTCAC 3780

ATGGCTCAAG ACATTGTTGA TTTAGTATCT AAACGCTTGA AAAAAGACTA CGGTTTAACA 3840

TTTAGTCCAT GTAATACAAA AGGTCTGGCA ATTTCAGGTG GCGATGTAGG TGGTAGCAAG 3900

AACTTTGATG CGTTTGTAGA GCAAAAAGTA GATGTAGCTA AAGGATTCGG CATTGATGAA 3960

GATGTTGCAA GACGTTTAGC ATCTAAATAT GGTTCAAATG TTGATGAATT GTTCAACATT 4020

GCGCAAACAT CTCAATACCA TGATAGCAAG TTACCATTAG AAATTTATGT AGAACTTGTT 4080

TATAGTATTC AACAAGAAAT GGTATACAAA CCTAACGATT TCTTAGTTCG TCGTTCTGGT 4140

AAAATGTATT TCAATATTAA AGATGTATTA GATTATAAAG ATGCTGTCAT CGATATTATG 4200

GCAGATATGC TTGATTACTC TCCAGCTCAA ATTGAAGCAT ATACTGAAGA AGTTGAGCAA 4260

GCAATTAAAG AAGCGCAACA TGGaAATAAT CAACCAGCAG TTAAAGAATA AtTAATTTGT 4320

ACAATCATAA ACTGGTGTCC TGTTTTAAGG GCATCAGTTT TTTTATACGA GATACATTAG 4380




GTTATTAAAG GTGTGAGATG ATGACTGAAA AACAATTTAA ATTAACTGTA CAAGATAATA 4500 

CGAATATTGA AGTTAAAGTG AATTTTACAG ATGTAGATTC AAAAGGAATT ATTCATATAT 4560

TTCATGGTAT GGCTGAACAT ATGGAACGTT ACGATAAATT AGCACATGCA CTTTCAAAGC 4620

ATGGCTTCGA TGTGATACGT CATAATCATC GAGGACATGG TATTAATATT GATGAATCAA 4680

CAAGAGGGCA TTACGATGAT ATGAAACGAG TTATCGGTGA TGCCTTTGAA GTAGCGCAAA 4740

CAGTGAGAGG CAATGTTGAT AAACCATACA TTATAATCGG ACATTCAATG GGATCCGTTA 4800

TAGCTAGATT GTTTGTAGAA ACATATCCGC AATATGTTGA TGGTCTAATT TTAAGTGGTA 4860

CTGGTATGTA TTCATTATGG AAAGGTTTAC CAACCGTTAA AGTGTTACAA CTGATTACAA 4920

AAATTTATGG TGCTGAGAAA CGAGTTGAAT GGGTTAACCA GTTAGTATCA AATAGTTTTA 4980

ATAAAAnnAT ACGTCCATTA CGTACACAAA GTGATTGGAT TTCTAGTAAT CCAATTGAAG 5040

TAGATAaCTT TATTAAAGAT CCATATAGTG GaTTTAATGT GTCAAATCAA TTATTATATC 5100

AAACAGCCTA TTATATGCTA CATACATCAC AATTAAAAAA TATGAAAATG TTAAaTCATG 5160

CCATGCCTAT ATTATTAGTT TCAGGATATG ACGATCCTTT AGGTGATTAT GGTAAAGGGA 5220

TTTTAAAATT GGCGAATATA TATAGAAACG CTGGCATnAA AAATGTTAAA GTGAATCTTT 5280

ATCATCATAA ACGTCATGAA GTGTTATTTG AAAAnGATCA TGACriAAATT TGGGAAGACT 5340

TGTTTAAATG GTTGAATCAA TTTTATAAAA AATAAAGAAA GTGGAATTAA ATATGAATAA 5400

AAATAAGCCT TTTATTGTAG TAATTGTGGG GCCAACTGCT TGCAG 5445 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2569 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

TGGCTTGAAC TACGCCAATA AGTCCCCCTA GTACAAGAAT GAATACCATG ATATCGACCG 60

CTTCTATCGT ACCTTCAACC ATGCTACTTG TTATTTGTTC TGGTCCAGCT GGATGTTGCT 120

TTAATCTTTC ATAAGTATTC GGAATTGATA CCGGCTTATT AATTGCACCT GATTTAAATT 180 
GTTCAATCTT AATTTTAACC CCCATTTTGT CTAGTTCCTG TTGCGTACCC GGAACCTTTT 240 


TCACTTGGTT ATGAGGGTTA ACTATCTTTA GTTCTTGGGA TGAAGGTTCG TAAGAAAGTT 300 
TAGAATATGC ACCAGCAGGA ATAACCCATG TTGCTATAAC TGCAACAACC GTTAAAATGA 360 
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15 



TAATTGTATT TTCCACGGTT TCATCTCCTT CGACATTTAA CCTAGCATTT CTACCTTAAA 480

GATTTTATAA ATATAAATTA AGAAAGTGCA CCCCGCATCA AAATAGAGGC ATTATTTTCA 540

GGGGGTGCAC ATAAATAATA AAAATCATGC ATTTGACATA TAGTAATTGA AAAGCGTTTC 600 

AATTCAATTA CTTTTTAATC ACAGTACCTA CTTTACCCTC TAAGGCAGCA TCTAATTCAT 660 

TTAATGATGT TATAAGCACA CTTCCTTTTG GATTGTTTTC AATAAATGAT ATGGCTGCTT 720 

CAATTTTTGG TAACATACTT CCTTTTGCAA ATTGATTTTC GTCTATATAT CGTTTTAATT 780 

CATCAACATT TGTTGTTTTC AAAGGCTGTT GGTTTTCAGT GTTAAAATTA ATATATACAT 840

AATCAATTGC TGTTAAAATA ATCAATTGAT CGCATTGAAT ATTAGCACCC AACAACGCAC 900 

TTGTTTTATC TTTGTCTATA ACTGCATCAA TACCTTTAAA ACCATCATGT TGCTCTCTAA 960 

TTACTGGTAT ACCTCCACCA CCAGCAGCAA TAACGAGTGT ATCATTTTTA ATAAGTGTTT 1020 

TAATACTCTC TAATTCAATA ATAGAGATGG GTTGTGGTGA AGGAACAACG CGTCTATATC 1080

CTCTTCCAGC ATCTTCAACA AATATAAATC CTTTTTCTTT TTGAATTTGT TCAGCTTCTT 1140 

CTTTGTTGTA AAATAACCCA ATTGGTTTTG AAGGATTGTT AAATGCCGGA TCATTTTCAT 1200 

CAACTTCAAC TTGTGTCACT AGTGTTACCA CTTGTTTATC CATTCCAATA GAATGCAATT 1260 

CATTTTGTAA GCTTTCTTGT AATTGATAGC CGATGTAAGC TTGACTCATT GCGCCACATT 1320 

CAGCAAATGG AAATGCCGGA CCTTGGTTAT GTTCTGCAGC ATAGTTAAGT CCCAAATTAA 1380 

TGCTTCCAAC CTGTGGTCCA TTACCATGAC TAATAACAAT CTCATGTCCT TTTGTnATTA 1440 

AyCCTACTAA TGATTtCGCA GTATTTTTAA CAAGCTCGAG TtGgTyCTTG aGGTGATTTn 1500 

CCTAAAGCAT TACCACCTAA TGCTACTACT ATTTTCGCCA TCATATTCAC TTCCTTATAT 1560 

CATTTAAAAT TCACCCAATG TAGCAACCAT GaCTGCTTTG ATTGTATGCA TTCTGTTCTC 1620 

AGCTTCTTGG AATACAACTG AAGCTTTACT TTCGAATACT TCATCTGTAA CTTCCATTTC 1680 

TCGAATACCA TATTTTTCAA AAATTTGTTG ACCTATTTTC GTATCAGCAT TATGGAAAGA 1740

TGGTAAGCAA TGCTCAAAAA TAACATTTGG ATTACCAGTT TTATCCATTA TTTCTTTATT 1800 

TACTTGATAT GGTTTCAATA ATTCAAGTCG TTCTTTCCAT ACTTCATCAG GTTCACCCAT 1860 

TGATACCCAA ACATCAGTGT AAATTACATC CGAACCTTTT ACaCCTTGGT CaATATCATC 1920 

TGTGATTAAT ATGTTGCCaC CATTTTCaGC GGCAATATTT TTACAGCGAT TTAATAATTC 1980 

ATCTGTTGGA TTTAATTCTT TTGGACAAAC TAAATGGAAG TTCATACCCA TAATGGCAGC 2040

ACCTTGCATT AATGCATTTG CAACGTTATT ACGACCATCT CCAACATATG TAAAGTTAAT 2100

ATCTGCATAA TCTTTTTTTA AGACTTCTTT TGCTGTTAAG AAATCAGCAA GAACTTGAGT 2160 
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TTCTACTGTT CTTTGTGAAA AACCACGGTA TTCAATGCCA TCATACATTC CACCAAGCAC 2280 

ACGTGCAGTA TCTTTAGTTG TTTCTTTTTT ACCCATTTGT GATCCAGTTG GGCCTAAATA 2340 

AGTTACATTT GCACCTTGAT CATGCGCTGC AACTTCAAAT GCACATCGCG TTCTTGTAGA 2400

ATCTTTTTCA AATAACAGTG CAATATTTTT ATTTTTTAAC ATAGGCTTTT CAGTGCCAAT 2460 

ATATTTAGCA CGTTTTAAAT CCTCGGAGAG TGTTAATAAG GTTCTACCTC TTGTCGTGAA 2520 


AAGTCTAATA AAGTTAAAAA ACTTCTGTTT CGTAnATTTT TCATTAAnA 2569 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1273 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
CCTGGAACCA TCCaATCGtG CaAATCtTGa AAGaGAATAC GCAACAACAA TTAAATGTAT 60 

TGGAACACTA TATTCCAAAT GACCATCCAG CACTCGTTGA ATTAAAAATA TGGGAACGTT 120

GGTTACATAA ACAAGGTTAC AAAGACATCC ATTTAGATAT TACTGCGCAC CACCTAGATC 180 
CTATTACACA GGTTTATTTA TTCAATGTCA TTTTGCTGAA AATGAATCTC GAGTTTTAAC 240

AGGTGGTTAT TACAAAGGAA GCATCGAAGG GTTTGGATTA GGATTAACAC TTTAAGTAAG 300

GGAGTATGCA CAATGTTAAG AATCGCCATA GCCAAAGGAC GTCTAATGGA TAGTTTAATT 360 
AACTATTTAG ATGTAATTGA ATATACGACA TTATCAGAAA CATTAAAAAA TAGAGAACGC 420 


CAATTATTAT TAAGTGTAGA TAATATTGAA TGCATTTTAG TAAAAGGAAG TGACGTGCCA 480 
ATCTATGTGG AACAAGGAAT GGCAGACATA GGCATTGTTG GTAGCGACAT ATTAGATGAG 540 
CGCCAATATA ATGTTAATAA TTTGTTGAAT ATGCCTTTTG GAGCATGTCA TTTTGCGGTT 600 


GCAGCGAAAC CTGAAACGAC CAATTATCGT AAAATCGCAA CGAGTTATGT TCATACTGCT 660 
GAAACATATT TTAAATCAAA AGGTATTGAT GTCGAATTGA TTAAATTGAA TGGCTCTGTT 720 
GAATTGGCCT GTGTTGTAGA TATGGTAGAC GGAATTGTCG ACATCGTTCA AACAGGTACT 780

ACGCTAAAAG CGAACGGACT GGTTGAAAAG CAACATATTA GTGATATCAA TGCAAGATTA 840

ATAACTAATA AAGCAGCTTA TTTTAAAAAA TCACAATTAA TAGAGCAATT TATTCGCTCT 900 


TTGGAGGTGT CTATTGCCAA TGCTTAATGC ACAACAATTT TTAAATCAAT TTTCATTAGA 960 

AGCACCATTA GATGAGTCAT TGTATCCaAT TATTCGCGAT ATTTGTCAGG AAGTTAAAGT 1020 
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TTTAGaAATT AGTCATGAmC AAATTAAAGC AGCATTTGAC ACATTAGATG AAAAAACAAA 1140 

ACAAGCATTA CAACAAAGTT ATGAAAGAAT TAnAGCATAT CAaGAAaGTA TtaAACAGaC 1200 

GaATCAACAG TTAGAAGaAT CAGTGGaGTG tTrTGaAATA TACCATCCmC taGaAAGTGT 1260 

CGGTATTTAT GTG 1273

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GTTGATAAAT TAAAAATGTT TTTATCAGAT ATTCAAAGTT ACCAACAATA TAGTAAAGAT 60

CATCCGGTGT ATCAGTTAAT TGATAAATTT TATAATGATC ATTATGTTAT TCAATACTTT 120 

AGTGGACTTA TTGGTGGACG TGGACGACGT GCAAATCTTT ATGGTTTATT TAATAAAGCT 180 

ATCGAGTTTG AGAATTCAAG TTTTAGAGGT TTATATCAAT TTATTCGTTT TATCGATGAA 240

TTGATTGAAA GAGGCAAAGA TTTTGGTGAG GAAAATGTAG TTGGTCCAAA CGATAATGTC 300

GTTAGAATGA TGACAATTCA TAGTAGTAAA GGTCTAGAGT TTCCATTTGT CATTTATTCT 360 

GGATTGTCAA AAGATTTTAA TAAACGTGAT TTGAAACAAC CAGTTATTTT AAATCAGCAA 420 

TTTGGTCTCG GAATGGATTA TTTTGATGTG GATAAAGAAA TGGCATTTCC ATCTTTAGCT 480 

TCGGTTGCAT ATAGAGCTGT TGCCGArAAA GAACTTGTGT CAGAAGAAAT GCGATTAGTC 540

TATGTAGCAT TAACAAGAGC GAAAGAACAA CTTTATTTAA TTGGTAGAGT GAAAAATGAT 600 

AAATCATTAC TAGAACTAGA GCAATTGTCT ATTTCTGGTG AGCACATTGC TGTCAATGAA 660 

CGATTAACTT CACCAAATCC GTTCCATCTT ATTTATAGTA TTTTATCTAA ACATCAATCT 720 

GCGTCAATTC CAGATGATTT AAAATTTGAA AAAGATATAG CACAAATTGA AGATAGTAGT 780 

CGTCCGAATG TAAATATTTC AATTGTGTAC TTTGAAGATG TGTCTACAGA AACCATTTTA 840 

GATAATGATG AATATCGTTC GGTTAATCAA TTAGAAACTA TGCAAAATGG TAATGAAGAT 900

GTTAAAGCAC AAATTAAACA CCAACTTGAT TATCGATATC CATATGTAAA TGATACTAAA 960 

AAGCCCTCAA AACAATCTGT TTCTGAATTG AAAAGACAAT ATGAAACAGA AGAAAGTGGC 1020 


ACAAGTTACG AACGAGTAAG GCAATATCGT ATCGGTTTTT CAACGTATGA ACGACCTAAA 1080 

TTTCTAAGTG AACAAGGTAA ACGAAAAGCG AATGAAATTG GTACGTTAAT GCATACAGTG 1140
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GATGGATTAA TCGATAAACA TATTATCGAA GCAGATGCGA AAAAAGATAT CCGTATGGAT 
GAAATAATGA CATTTATCAA TAGTGATTAT ATTCGATATT GCTGAAGC 1308
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1431 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
GATGCCATTn ATnnGTATGC AAGAAGTTGT TCCGGGTTCA GGTGGATTaC CAGTTGGTAC 60
TGGTGGTAAG ACGTTACTAA TGCTTTCAGG CGGTATAGAC TCACCAGTTG CTGGGATGGA 120
AGTGATGAGA CGTGGCGTAA CAATTGAAGC GATTCATTTC CATAGTCCAC CATTTACAAG 180
TGATCAAGCA AAAGAAAAAG TTATTGAATT GACACGTATT TTAGCTGAAC GTGTTGGACC 240
AATTAAATTG CATATTGTAC CATTTACAGA ATTGCAAAAA CAGGTAAATA AAGTTGTACA 300
TCCAAGATAT ACAATGACTT CAACGAGACG TATGATGATG CGTGTTGCTG ATAAATTAGT 360
ACATCAAATA GGGGCTTTAG CTATTGTAAA TGGTGAAAAC CTAGGGCAGG TAGCCAGTCA 420
AACACTTCAT AGCATGTATG CAATTAATAA TGTAACTTCT ACTCCTGTAT TACGTCCTTT 480
ATTAACTTAC GATAAAGAAG AAATTATTAT TAAATCGAAA GAAATTGGTA CATTTGAAAC 540
ATCTATTCAA CCATTTGAAG ATTGTTGTAC AATTTTCACC CCTAAAAATC CAGTAACCGA 600
ACCAAACTTT GATAAGGTAG TCCAATATGA AAGTGTCTTT GATTTTGAAG AGATGATTAA 660
TCGTGCTGTT GAAAATATTG AAACACTTGA AATAACTAGT GATTATAAAA CTATTAAAGA 720
ACACJpAAACA AACCAATTAA TAAACGACTT TTTATAAATA AAATCCTAGA GTAAATTTAA 780
ACATAAGGGG ATGTTAAACT ATGGATTTGA ACTTAACGAT GATTATAATC ATAATTTTAT 840
TTGGTTTTAT CGCGGCGTTT ATAGATTCGG TTGTAGGGGG TGGCGGTTTA ATTTCTACGC 900
CAGCATTATT AGCAATCGGT CTACCACCAT CTGTGGCTTT AGGTACAAAT AAATTGGCAA 960
GTTCGTTTGG TTCTTTAACT AGTACGATAA AGTTTATAAG GTCCGGTAAA GTGGACTTAT 1020
ATGTTGTTGC CAAATTATTT GGTTTTGTAT TTTTGGCATC TGCATGTGGC GCATATATTG 1080
CAACGATGGT TCCGTCACAA ATATTGAAAC CTTTAATCAT CATTGCACTT TCGTCGGTGT 1140
TTATATTCAC ATTACTTAAA AAAGATTGGG GCAATACACG CACGTTTACT CAATTTACAT 1200
TTAAGAAAGC CATAATATTT GCAGCACTTT TTATATTAAT CGGCTTTTAT GATGGATTTG 1260
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TAAGTGCAGC AGGAAATGCT AAAGTTTTGA ACTTTGCTTC TAATATAGGT GCGCTTGTAT 1380 
TATTTATGGT ATTAGGACAA GTAGATTATG TAATAGGTTT AATTATGGCT A 1431 


(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4403 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 



AATATTATTT TAAATTCAAT ATTTATTGGT GCATTTATTT TAAACTTATT ATTCGCCTTT 60 

ACCATTATTT TCATGGAAAG ACGTTCTGCC AATTCTATCT GGGCTTGGTT ACTAGTCTTA 120 

GTTTTCTTGC CTTTATTCGG CTTCATTTTA TACTTACTAT TAGGACGACA AATTCAACGT 180 

GACCAAATTT TCAAAATTGA TAAGGAAGAT AAAAAAGGAT TAGAGTTAAT CGTTGATGAG 240 

CAATTAGCTG CTTTAAAAAA TGAAAACTTT TCAAATTCCA ATTATCAAAT TGTAAAATTT 300 

AAAGAAATGA TTCAAATGTT GTTATATAAT AACGCAGCAT TTTTAACAAC AGACAACGAT 360 

TTArrrrtAT ACACAGACGG CCAAGAAAAA TTTGATGACC TAATACAAGA CATCCGTAAT 420 

GCTACTGATT ATATTCATTT TCAGTACTAT ATTATTCAAA ATGATGAATT AGGTCGTACC 480 

ATTTTAAATG AACTTGGTAA AAAAGCGGAA CAAGGTGTAG AAGTTAAAAT TCTTTATGAT 540 

GACATGGGTT CTCGTGGACT GCGTAAAAAA GGCTTACGCC CGTTTCGCAA TAAAGGTGGA 600 

CATGCTGAAG CATTTTTCCC ATCAAAATTA CCTTTAATTA ACTTGCGTAT GAACAATCGA 660 

AACCATCGAA AAATTGTTGT AATAGATGGG CAAATTGGAT ATGTTGGTGG TTTTAATGTT 720 

ggtg&tgagt ACTTAGGTAA ATCAAAAAAA TTCGGCTATT GGCGAGATAC GCATTTACGA 780 

ATTGTCGGGG ATGCAGTGAA TGCATTGCAA TTACGATTTA TTCTAGATTG GAATTCACAA 840 

GCCACACGTG ACCACATCTC CTATGATGAT CGTTATTTCC CAGATGTAAA TTCTGGTGGA 900 

ACAATTGGCG TTCAAATAGC TTCTAGTGGT CCTGACGAAG AATGGGAACA GATTAAATAC 960 

GGCTATTTGA AAATGATTTC ATCTGCTAAA AAATCGATTT ATATTCAATC TCCCTATTTC 1020 

ATACCTGATC AAGCCTTTTT AGATTCTATT AAAATTGCGG CATTAGGTGG TGTTGATGTC 1080 

AATATCATGA TTCCTAATAA ACCTGACCAT CCGTTTGTTT TTTGGGCTAC TTTAAAAAAT 1140 

GCAGCATCCT TATTAGATGC CGGTGTTAAA GTATTTCACT ACGACAATGG CTTTTTACAC 1200 

TCAAAAACAC TTGTTATAGA TGATGAAATT GCAAGTGTGG GAACAGCTAA TATGGACCAT 1260 
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AAATTAAAAC AAGCTTTTAT AGATGATTTA GCAGTATCTT CTGAATTAAC AAAAGCACGT 1380 

TATGCTAAGC GAAGTCTTTG GATTAAATTT AAAGAAGGTA TTTCACAATT ATTGTCACCT 1440 

ATCTTATAAA ATAGAAATAT GAGGAGTGTA aCTTTAATGC AACAATCAGA CGTCATTAGT 1500 

GCTGCCAAAA AATATATGGA ATCTATTCAT CAAAATGATT ATACAGGCCA TGATATTGCG 1560 

CATGTATATC GTGTCACTGC TTTAGCTAAA TCAATCGCTG AAAATGAAGG TGTTAATGAT 1620 

ACTTTAGTCA TTGAACTCGC ATGTTTGCTT CATGATACCG TTGACGAAAA AGTTGTAGAT 1680 

GCTAACAAAC AATATGTTGA ATTGAAGTCA TTTTTATCTT CTTTATCACT ATCAACCGAA 1740 

GATCAAGAGC ACATTTTATT TATTATTAAT AATATGAGCT ATCGCAATGG CAAAAATGAT 1800 

CATGTCACTT TATCTTTAGA AGGTCAAATT GTCAGGGATG CAGATCGTCT TGATGCTATA 1860 

GGCGCTATAG GTGTTGCACG AACATTTCAA TTTGCAGGAC ACTTTGGTGA ACCTATGTGG 1920 

ACAGAACATA TGTCACTAGA TAAGATTAAT GATGATTTAG TTGAACAGTT GCCACCATCT 1980 

GCAATTAAAC ATTTCTTTGA AAAATTACTT AAGTTAGAAT CTTTAATGCA TACAGATACG 2040 

GCGAAGATGA TTGCTAAAGA ACGTCACGAC TTTATGATGA TGTACTTGAA ACAGTTTTTT 2100 

ACGGAATGGA ATTGTCACGA CTAGACATTG AAGTTGTAGT ATGATGATGC GATGTAATGG 2160 

CGTGTTGTTG TGGAAGCTTG GTGTCATGCC ATGTTACTTT GATGTGTTGT TGTGGGAGCT 2220 

TGGTGACATG TCATGCTACT TTGATGTGCT GGTACCACGA TGCGTCTTGA TGTAGTGCTA 2280 

TGATGTGGCA TTGCGGTGTT ATGGTGTTAT AGACAGGTTT GGCGTTGATG CCATGTTACT 2340 

TTGATGTGCT GGTACCACGA TGCGACTTGA TGTAGTGCTA TGATGTGGCA TTGCGGTGTT 2400 

ATGGTGTTAT AGACCGGTTT GATGTTGATG CCATGTTACT TTGATGTGCT GGTGCTACGA 2460 

TGCGACTTGA TGTAGTGCTA TGATGTGGCG TTGCGCTGTT ATGGTGTTAT AGCCAGGTTT 2520 

GGTGTTGATG TCATGCCGTT ACGATTCTAT GATATGTTGT TGGGACGTTG CAATGTGTAT 2580 

TATGCCGTTG TGACGTTATT ATTTCACACT GTTACATGTA TAAGTGAATT GCTGTGGAAA 2640 

TTTGCGACAT ATACTGCTAC ACTGATGAAT CATTGTGTCA AGATGACATT GCGATGAAGA 2700 

ATGACAACTC TGTTATTAAC CACTTTTTAC ATACTGAAAA CTCGTTAATA TTATTTCAAA 2760 

TAAAAACAGC AGTAGGATGA CTTTCACATT TGAAATCATC TTACTGCTGT TTCTATTTAT 2820 

CACATATTGT ATAATGTGAC ACTAAGTTTC GCTATTGAAG CGAAAAATAA TGTGCGCCCT 2880 

ATAAAGTTAA AATTATCTTC AACTTTTAGG GTGCACATTA TTTGGACTTG CTAAGGTTAT 2940 

TTCTTTTTCT TTTTAGACAC AACTTGTGTG TTTTTGCCTT TTTTATTGct GCCGCCGTTG 3000 

TGCTCTCTTT CATACGCTTC AATGAAAGGT TGTACTTCTT TTTTAGCGAC TTTTTCATAA 3060 
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CCAAGTGCTG ATGCTGAGCT TAATGAAATC CAGATAATCA TAATTGGTGA AATGACCATC 

ATCATGTAAC CCATTTGACG TTGTTCGTCT GGCATCGTTT TACTTGATAC ATATGCTTGG 

ATAAAGTATA AAACACCGGC AATAATTGTA ATCCAAATAT CAGGACGTCC TAAATCGAAC 

CATAAGAAGT GTGGATATTT AAACAAACCA TCTACAAGTT GGTCTTTAAG TACAAAGTAT 

AATCCCATGA TGATTGGTAA TTGGATTAGC ATTGGTAAAC AACCCAACAT ACTCTTAATC 

GGGTTCATGT CATACTTTTT ATATACTTGC ATTAATTCTT GGTTTGCAGC CATTTTTTCT 

TCTTGTGTAC GCGnCaCGTT cACTTTTTCT TGAATTTTTT CAACTTCTGG CTTTGCAACT 

TTCATTTTTT GACGCATCAT ATGACTATTT TTATAGTTTG ACAACATGAA TGGTAATAAA 

ATAATACGAA TTACCAATAC AAGGATAATA ATAGCTAAAC CATAATTGTC GTTTAATAAG 

TTATTTCCCA ACCAATCCAA TACATTTTTC ATTGGATCTA CGAATGTATT GTAGAAAAAy 

cwCtACGTTT TTCAGGTTTA GAATAGTCAC AACCAGCCAA AAAGACCATA ATACCTAAAA 

ATAATGGTAG TAACGCTTTT TTCTTCATTT TTCCACCTCT ATCATTATAT TCACATAGGA 

TTTATTCTAT CACATTAATG AGTACGTATG AAACAATAAG TGGAAAAATT TAACTAATTA 

TTAAAAAAAT CTTTGAATCG ATTAACAGTC TTTTCAATAT TTTCACTTTT AGAAATGGCT 

GAAATGACTG AAATTCCATT GGCACCTGCT TCTACAATCG GCGCCACATT ATTAGTATTG 

ATACCGCCAA TAGCTACAAT CGGTAGTTGC GGATTCATTT CTTTAAACGT TGCAATCATT 

TCTGGACCTA CTGGTATATG CGCGTCATGC TTCGACGGCG TAGGATAGAT TGGTCCAACA 

CCTATATAAT CmACATGAGT TAAATCAGAT TTTGCATACT CATCTAAATC ACTAATACTA 

AGTCCAATAA TTTTATCAGT GAAATATTGT GCTATCTCTT TGACTTTCGC ATCATCTTGA 

CCGACATGTA TACCATCCGC GTTAATTTCT TTTGCCAAGG ATACATCATC ATTAACGATA 

AAAGGCACAT CATATTGATG ACAGAGATGC TGTAATTCTT TAGCTAATAC AAGTTTATCG 

TTTCtTTTTA AAGCTGATTC ACC 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1808 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
TGGAnCCAAT ATTAGAAATG ATTAAAACAT TAACAGGTAT TAATAGTCCT TCAGGAGnCA 
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TAACAAATAA AGGTGCGTTA TTAATAACAG TGCCAGGCAA AAATGATGAA GTACAACGCT 180 

GTATTACTGC TCATGTTGAT ACTTTAGGTG CaATGGTTAA AGAAATTAAA GAAGATGGTC 240 


GCTTaGCAAT AGAATTAATT GGAGGATTCA CGTATAACGC GATTGAGGGT GAATATTGCC 300 

AAATTAAAAC TGATGCTGGT CAAATATATA CAGGAACAAT TTGTCTGCAT GAAACAAGTG 360 

TTCATGTATA TAGAAATAAT CATGAAATAC CTAGAGATCA AAAGCATATG GAAATAAGAA 420 


TTGATGAAGT AACTACATCA GAAGAAGATA CAAAGAGTTT AGGTATTTCA GTAGGTGATT 480 

TTGTTAGCTT TGATCCACGT ACAGTTATCA CGTCATCAGG TTTTATTAAA TCTCGTCATT 540 

TAGATGATAA AGCTAGCGTA CGgTtGATAC TACAATTACT AAAGAAATTA AAAGAAGAGC 600 

AAATAATATT ACCACATACA ACGCAATTTT ATATTTCTAA TAACGAAGAA ATAGGTTACG 660 

GTGCAAATGC ATCAATTGAT TCGAAAATCA AAGAATATAT TGCATTAGAT ATGGGCGCGT 720 

TGGGAGACGG TCAAGCATCG GATGAATATA CAGTTTCTAT TTGTGCCAAA GATGCTTCAG 780 

GTCCATATCA TAAGCAATTG AAATCGCACC TAGTTAATCT TTGCAAAATA AATAACATTC 840 

CATATAAAGT AGACATATAT CCATATTATG GTTCAGATGC TTCAGCAGCT TTACATGCTG 900 


GTGCGGATAT CAGACATGGT TTATTTGGCG CTGGCATTGA ATCATCTCAT GCAATGGAAC 960 

GAACACATAT TGATTCTATT AAAGCGACAG AGAAATTACT ATATGCATAT TGCTTATCAC 1020 

CAATTGAGTA AACAATTAGT GTTGACAAAT GTGaACGACC TATGTAATAT AATGAACTAT 1080 


AAAAATAATT AGAATTTTCT AAAGAAATAG TAGCAGATAT GAAACGTAGC AAATAGAAAG 1140 

CTAATGGGTG ATGGGAATTA GCACGCCATA TCTTGTGAAT TGGACTTTGG AAAACAATTG 1200 

AATGAGTTTT GAAAGTGAAC ATGAATTATG TTAACTAAGG TGGCACCACG GTAACGCGTC 1260 

CTTACAGGTA TATGCGTTAT GTGGTGTCTT TTTATTTAGA CAAAATGTAG TAGTTAATTA 1320 

AAGGTAGCAA CAGAAAGTTA GTGGATGATG TGAACTAACA CCGAGATTAA TGAAATTGGG 1380 

TTTTGTCTGC AACAGAAAAA TTATATATAG TAAAGAGTGA ACTATGAATA TTTCGAATAT 1440 

TCGGTTAATT TAGGTGGTAC CACGCGTCAC nTCCTTTATA TTGATAAGGA TGCTGGCGCT 1500 

TTTTTGAAAG GAGCGTATAG AATGGATATA TTTTATAAAA AAATAAAAGC AAATGTAACG 1560 


CCCGAAGTTT TAGCACAACT TCATTCCAAG AAGaTCATTT TGGAAAGTAC AAATCAACAA 1620 

CAAACTAAAG GTCGCTATTC AGTTGTTATT TTTGATATTT ATGGCACTTT AACTTTAGAT 1680 

AATGATGTAT TATCAGTAAG TACTTTAAAA GAATCGTATC AAATCACTGA AAGACCGTAC 1740 


CATTATTTAA CGACTAAnAT AAATGAAGAC TACCATAATA TTCCAAGATG AGGCAACTTA 1800 

AGTCATTA 1808 

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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1320 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

TGGTCGTCAA TTTCTTGATT ATATCTATAA TCCTCATTTT CAATATTAGA GTCTGTAGAA 60 

TCATCGATAT TATTATCATT CGCATGACTA GAAGCAGAAT CATTATTTTT ATCATTGCTT 120 

TCTTCTTTTT TGAAGTCTTT ATTTATCAAG TAAATTTCTT CATCAAAATC AGCTTGTTGA 180 

GATGTATCAT CTTTATTTTG ATTAGAAAAA TGTGTAGCCT TTGATCTTTT TCTTTGCCGT 240 

CTTTTCTTAG ATGTATTCCT CGTAAATAAT TCTAATTCAT CTTTATCTTC ATTTGATTCT 300 

TGTTGATCGT TCTTCGTTTT ATCATCCATC AATACTCACA CCCTTTAATA AGATGGTAAA 360 

TGGGCACGGA ATCTTTCAAT AAATTTCTCT CCACGCTCTT CAAAAGTACT ATATTGATCC 420 

CAACTCGCAC AAGCAGGTGA CAATAATACA ACATCATTTG GTTCTATAAT ATCTTGTACT 480 

TTATCAACAG CGTCTTCGAC ATTGTTCGCT TCAATGACCG ATTTCCCTTG ACTATTACCT 540 

AGTTTAGCAA ACTTAGCTTT CGTTTGTCCG AATACAACCA TCGCGCGAAC ATTTTCCATA 600 

TAAGGAATGA GTTCGTCAAA TTCATTCCCT CGATCCAAAC CACCACATAA CCAAATGATT 660 

GGTTGATTAA ATGAATTTAA GGCAAACTGT GTTGCTAGCG TGTTTGTTGC TTTGGAATCA 720 

TTATAATATT TATTAGTTCT ATTAGTACCA ACATATTGCA ATCTATGCTC TATTCCTGAA 780 

AATGTAGTTA AACTATCAAT AATTGCtTTA ATAGGTACAC CAGCanAATA CAAGCAAGCA 840 

CAGCTGCTAA TATATTTcTA AATTATGTTC ACCAGGCAAT ACTAGAtCTT CAGTGTTAAT 900 

AATadGAACA CCTTTATaAA CGATAAAACC ATCTTtAATA TAAaTACCAT CArCTtCTTG 960 

TTGAGTTGAG AAATACAATG TCTTAGCTTT TAATTCTTCC GACTCTATCA CTTGTCTTTG 1020 

ATGATAATTA CAAATCAAAT AATCCTCTTC CGTTTGATTT TTATATATTT GCTTTTTAGC 1080 

ATTTTGATAG TTTTCTAAAT TTTCATGGTA ATCTAGATGC GCCGAATAAA TGTTAGTAAT 1140 

TATAGCAATG TGTGGTTTAT ACTTTTCGAT TCCAAGTAAC TGGAATGACG ACAACTCTGT 1200 

AACTAAATAA TCTGTAGGCT TTACTTCTTG TGCTACTTTA GATGCAACAT AACCAATATT 1260 

GCCGGATAAT CTTCCAGTTA AGCGACTTTT TTTAAACATA TCTCCAATTA GAGAAGTAAC 1320 


(2) INFORMATION FOR SEQ ID NO: 81: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4280 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

TTTACACCAA TCAAAAAATC GAACTGATAT AAATAAGTAC AAAGCTTATC TATCAATCCG 60 

ATTTAGTTAT AAAACAAAAA AAGCCACAGT AATGTGGCTT TTTGTTATAT TCAGTATCAA 120 

AATGGTATCA ATAGCCATTT TCGGAAGTCA AGAATGGCTT AACAACGCGG TTTAAAGCTA 180 

TCCAATACTA CCTTCCATTT CGAACTTGAT TAAACGGTTC ATTTCGACCG CGTATTCCAT 240 

TGGAAGTTCT TTTGTAAATG GTTCGATGAA TCCCATAACA ATCATTTCTG TCGCTTCTTC 300 

TTCAGAAATA CCACGACTCA TTAGATAGAA TAATTGTTCT TCAGAAACTT TTGAAACCTT 360 

GGCTTCATGT TCTAATGATA TTTGATCGTT GAATACTTCG TTATATGGAA TTGTATCTGA 420 


TGTTGATTCG TTATCTAAGA TTAATGTATC ACATTCAATA TTTGAACGAG CACCTTTTGC 480 

TTTACGTCCA AAATGAACAA TACCGCGATA AATAACTTTA CCACCATTTT TAGAAATAGA 540 

TTTAGAAACA ATTGTAGAAG ATGTATTAGG TGCTTTATGA ATCATTTTAG CACCGGCATC 600 


TTGAACTTGT CCTTTACCAG CAAATGCAAT AGATAATGTA CTACCTTTTG CACCTTCACC 660 

TAAAAGAACA CAGTTTGGAT ATTTCATCGT TAACTTAGAA CCTAAGTTAC CATCTACCCA 720 

TTCCATATTT CCGTTTTCAT AAACAAAAGT ACGTTTTGTA ACTAAATTGT ATACATTGTT 780 

CGCCCAGTTT TGAATCGTAG TATAACGAAC GTGCGCATCT TTATGCACAA TGATTTCCAC 840 

AACAGCAGAG TGTAAAGAAC TAGTTGTATA AACTGGTGCA GTACAACCTT CTACGTAATG 900 

TACAGAAGCA CCTTCATCAG CAATGATTAA TGTACGTTCA AATTGACCCA TGTTCTCAGA 960 

GTTAATACGG AAATAAGCTT GTAGTGGCGT ATCTAGTTTG ATATTTTTAG GTACATAAAT 1020 

GAAGGAACCA CCTGACCATA CTGCTGAGTT TAACGCCGCA AATTTGTTAT CTGCTGCAGG 1080 

TACTACAGAA GCAAAGTATT TTTTGAATAA TTCTTCATTT TCTTGTAAAG CACTATCTGT 1140 

ATCTTTAAAG ATAATACCTT TTTCTTCAAG TTCTTTTTCC ATATTATGGT AAACAACTTC 1200 

AGATTCATAT TGAGCAGAAA CACCAGCTAA ATATTTTTGT TCAGCTTCAG GAATTCCTAA 1260 


TTTATCGAAA GTTCTTTTAA TTTCTTCTGG CACTTCATCC CATGAACGTT CAGCTTGTTC 1320 

TGAAGGCTTT ACATAGTAAG TAATGTCATC GAAATTCAAT TCTGATAAGT CGCCACCCCA 1380 

TTGAGGCATT GGCATTTTAT AAAACAATTT TAATGATTTA AGACGGAAAT CTAACATCCA 1440 


TTCCGGCTCA TTTTTCATGT TAGAAATTTC TCTAACGATA TTCTCAGTTA AACCACGTTC 1500 

TGATCTGAAA ATGGACACAT CATCGTCGTG GAATCCATAT TTATAATCCC CAACATCAGG 1560 
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TTTAATTCAT GATGTAAACC ATATTATAAC AATGACATGA CATCTTATAA AAATTTTTAT 1680 

ACTTTTATAT GTCTAATATC AAAATTATCT ATGATTAACA GCATTCTATT CTTCTTCAGT 1740 

CGTACCTTCT GCTTTACCTT CTTTAGCAAC AGTACCTTTT TCCAATGCTT TCCAAGCTAA 1800 

TGTGGCACAT TTAATACGAG CTGGGAATTG AGATACACCT TGCAATGCTT CAATATCTCC 1860 

CATTTCTTCT GTAATCACAT AGTCTTCACC AAGCATCATT TTCGTAAATT CTTGGCTCAT 1920 

TTGCATTGCT [illegible] [illegible] AACAGCTTGT GTCATCATCG ATGCACTTGC 1980 

CATTGAAATC [illegible] [illegible] CTTAGCATCT TTTATAATGC CGTCTTCTAT 2040 

ATCAAATGTT [illegible] [illegible] TGTCGGGTTA TTCATATCTA CTGTCATAGA 2100 

CCCGTTATCT [illegible] TATTTCTAGG ATTTTTATAA TGATCCATAA TGACAGATCT 2160 

ATATAATTGA [illegible] [illegible] AAGAGAAAAA CTCCTTCGTT TGTTTCAAGG 2220 

CATTTATTAA [illegible] TCTTCTTTCG TGTTGTATAT ATAAAAACTC GCTCTAGCTG 2280 

TTGAAGACAC ATTTAACCAT TTCATTAACG GTTGCGCACA ATGATGCCCA GCTCTAACCG 2340 

CTACACCTTC TGTATCTACG GCTGTAGCAA CATCGTGTGG ATGTACATCT TGTAAATTAA 2400 

ACGTTATTAC ACCTGCACGA CGATCCTTTG GCGGGCCATA AATTTCAATT CCTTCAATTG 2460 

CAGACATTTG CTCATAAGCA TATATCGTTA ATTCTTGTTC ATATTTATGA ATTGCATCAA 2520 

AACCTATGCG TTCTAAATAG CGAATAGCTT CTGCAAGCCC AATTGCTTGA GCAATTAATG 2580 

GAGTACCCGC CTCAAATTTA GTAGGTAAAT CAGCCCATGT TGCATCATAC TTACTTACAA 2640 

AATCAATCAT GTCGCCACCG AACTCAATCG GTTCCATTTT TTGTAGTAAC TCACGTTTAC 2700 

CAAATAATAC GCCAATACCT GTTGGTCCAA GCATTTTATG ACCACTAAAA CTATAAAAAT 2760 

CAGCATTCAT TTCTTGCATA TCAAGTTTCA TATGTGGTGC TGctTGCGCC CCATCAACAC 2820 

TGATZATTGC ACCATGTTGA TGAGCTATTT CTGCAATGGT TTTAACATCA TTAATTGTAC 2880 

CGAGCACATT AGATATATGT GCAATAGCAA CGATCTTTGT TTTATCATTA ATCGTTTGCT 2940 

TAATATCCTC GATGTTTAAT TCACCGTCAG CTGTCATTGG TATAAATTTC AATGTCGCAT 3000 

TTTTACGCTT TGCTAACTGT TGCCAAGGAA CAATATTGGC ATGATGTTCC ATTTCAGTGA 3060 

CAACAATTTC ATCGCCCTCT TCAACATTTG CATCACCATA GCTATGTGCT ACAAGGTTAA 3120 

TCGACGCAGT TGTTCCGCGT GTAAAAATGA TTTCTTCAAA ATACTTCGCA TTAATAAAAC 3180 

GACGAACGGT TTCACGGGCA TTTTCATAAC CATCAGTTGC CAATGATCCT AATGTATGAA 3240 

CACCACGATG AACGTTTGAA TTATAACGCT TGTAGTAATC TTCTAAAACA TTTAACACTT 3300 

GCACAGGCGT TTGACTTGTC GCTGTTGAAT CAAGATATGC TAAACGTTTG CCATTGACTT 3360 
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CTTCATTCAC GACCTTTCTT AAATAAAAAT CCTAATCATT TAAATACTGA CGTTGTATTA 3480 

GTCTTATACC AATATCGACA GTCTATATCT ATTACAAACT TTTATTTTCA AAATATTATT 3540 

TAGAAACTTT GCGTTCAATT ACTTCTCTCA ATTGACGTTT AACGTCTTCG ATAGGTAATT 3600 
CACGTACTAC TGGATCTAAG AAACCATGTA TAACAAGACG TTCCGCTTCT CTTTGAGAAA 3660 
TACCACGACT CATTAAATAG TAAAGTTGAT CTGGATCAAC ACGACCTACT GATGCAGCAT 3720 


GACCAGCTTG TACATCATCT TCATCAATTA ATAAAATAGG ATTOGCGTCA CCACGAGCAT 3780 
GTTCAGATAA CATTAATACA CGTGATTCCT GATTAGCAAT TGATTTAGTT CCACCATGCT 3840 

TAATGTAGCC GATACCATTA AATACAGACG ATGCATGTTC TTTCATAACA CCATGTTTAA 3900 

GGATATAACC ATCTGTTTCT TTACCATATT GTACGATTTT AGATGTTAGA TTAATTTTTT 3960 
GTTCGCCTGT ACCTACAACT ACTGATTTAA GTGAACTTGT TGAACGATCA CCAAATAAAT 4020 

TTGTTGTATT ATCAATAATT TGGCTACCCT CATTCATTAA ACCTAGTGCC CAATTAATTG 4080 

AGGCATCCGC TTCAGTAATA CCACGTCGAA TGATATGACC TGTAAAGCCT TTATCCATAT 4140 
AGTCCACTGA GCCATATGTG ATATTTGAAT TTGCACCAGC AATCACTTCA GAAATAATAT 4200 

TtAATTGATT TCCTTCACCA GATGCATTTG mTAAGTAATT TTCAACATAT GTGACTTCGG 4260 

CGCTTTCTTC AGTAACGATG 4280 
(2) INFORMATION FOR SEQ ID NO: 82: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15598 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

TCnGACTCGA ACGGTGmAAC TAttCCGTTG TaATTCCgGA GgAAsCAAGG TATGCCCATC 60 

TGCaAAGAAA gaATGsAATG AACTTTTTGG AAATGTAGAA GTGGTAAATA AAGATAAAGG 120 

ATATTACATT CTGAGAAGTA TAAAAGCTTG AAATGAAATG GATATTCTGT TATAGTTATA 180 

TAATGTAAAA ATTTATGTTC AATAAGTGTG TACTTTTACG TTAAATAGAT AAGTTAATTA 240 

AGAATAAATA TAGAATCGAA AATGGTGTCA TCATTAGTGT TGCCGTTTTC TTTTTGTCTT 300 

TTTATTAATA TGCTTATGGT ATTTAGCTAA AAGCGGATCA CATAATTTTT GAGGGGTGAA 360 

TCTGTTTGGC AGGTCAAGTT GTCCAATATG GAAGACATCG TAAACGTAGA AACTACGCGA 420 

GAATTTCAGA AGTATTAGAA TTACCAAACT TAATAGAAAT TCAAACTAAA TCTTACGAGT 480 
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CTGGTAATTT GTCATTAGAG TTTGTGGATT ACCGTTTAGG AGAACCAAAA TATGATTTAG 600 

AAGAATCTAA AAACCGTGAC GCTACTTATG CTGCACCTCT TCGTGTAAAA GTGCGTCTAA 660 

TCATTAAAGA AACAGGAGAA GTTAAAGAAC AAGAAGTCTT TATGGGTGAT TTCCCATTAA 720 

TGACTGATAC AGGTACGTTC GTTATCAATG GTGCAGAACG TGTAATCGTA TCTCAATTAG 780 

TTCGTTCACC ATCCGTTTAT TTCAATGAAA AAATCGACAA AAATGGTCGT GAAAACTATG 840 

ATGCAACAAT TATTCCAAAC CGTGGTGCAT GGTTAGAATA TGAAACAGAT GCTAAAGATG 900 

TTGTATACGT ACGTATTGAT AGAACACGTA AACTACCATT AACAGTATTG TTACGTGCAT 960 

TAGGTTTCTC AAGCGACCAA GAAATTGTTG ACCTTTTAGG TGACAATGAA TATTTACGTA 1020 

[illegible] GAAAGACGGC ACTGAAAACA CTGAACAAGC GTTATTAGAA ATCTATGAAC 1080 

GTTTACGTCC AGGTGAACCA CCAACTGTTG AAAATGCTAA AAGTCTATTG TATTCACGTT 1140 

[illegible] AAAACGCTAT GACTTAGCAA GCGTGGGTCG TTATAAAACA AACAAAAAAT 1200 

[illegible] [illegible] TTTAATCAAA AATTAGCTGA GCCAATTGTA AATACTGAAA 1260 

[illegible] [illegible] GAAGGTACAG TGCTTGATCG TCGTAAAATC GACGAAATCA 1320 

[illegible] [illegible] GCAAACAGCG AAGTGTTTGA ATTGCATGGT AGCGTTATAG 1380 

[illegible] [illegible] TCAATTAAAG TATATGTTCC TAACGATGAT GAAGGTCGTA 1440 

[illegible] [illegible] GCTTTCCCTG ACTCAGAAGT TAAATGCATT ACACCAGCAG 1500 

ATATCATTGC TTCAATGAGT TACTTCTTTA ACTTATTAAG CGGTATTGGA TATACAGATG 1560 

ATATTGACCA TTTAGGTAAC CGTCGTTTAC GTTCTGTAGG TGAATTACTA CAAAACCAAT 1620 

TCCGTATCGG TTTATCAAGA ATGGAAAGAG TTGTACGTGA AAGAATGTCA ATTCAAGATA 1680 

CTGAGTCTAT CACACCTCAA CAATTAATTA ATATTCGACC TGTTATTGCA TCTATTAAAG 1740 

aattCtttgg TAGCTCTCAA TTATCACAAT TCATGGACCA AGCAAACCCA TTAGCTGAGT 1800 

taacgcataa acgtcgtcta TCAGCATTAG GACCTGGTGG TTTAACACGT GAACGTGCTC 1860 

AAATGGAAGT ACGTGACGTT CACTACTCTC ACTATGGCCG TATGTGTCCA ATTGAAACAC 1920 

CTGAGGGACC aaacattgga TTGATTAACT CATTATCAAG TTATGCACGT GTAAATGAAT 1980 

TCGGCTTTAT TGAAACACCA TATCGTAAAG TTGATTTAGA TACACATGCT ATCACTGATC 2040 

AAATTGACTA TTTAACAGCT GACGAAGAAG ATAGCTATGT TGTAGCACAA GCAAACTCTA 2100 

AATTAGATGA AAATGGTCGT TTCATGGATG ATGAAGTTGT ATGTCGTTTC CGTGGTAACA 2160 

ATACAGTTAT GGCTAAAGAA AAAATGGATT ATATGGATGT ATCGCCGAAG CAAGTTGTTT 2220 

CAGCAGCGAC AgcATGTATT CCATTCTTAG AAAATGATGA CTCAAACCGT GCATTGATGG 2280 
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CAGGTATGGA ACACGTTGCA GCACGTGATT CTGGTGCGGC TATTACAGCT AAGCACAGAG 2400 

GTCGTGTTGA ACATGTTGAA TCTAATGAAA TTCTTGTTCG TCGTCTAGTT GAAGAGAACG 2460 

GCGTTGAGCA TGAAGGTGAA TTAGATCGCT ATCCATTAGC TAAATTTAAA CGTTCAAACT 2520 

CAGGTACATG TTACAACCAA CGTCCAATCG TTGCAGTTGG AGATGTTGTT GAGTATAACG 2580 

AGATTTTAGC AGATGGACCA TCTATGGAAT TAGGAGAAAT GGCATTAGGT AGAAACGTAG 2640 


TAGTTGGTTT CATGACTTGG GACGGTTACA ACTATGAGGA TGCCGTTATC ATGAGTGAAA 2700 

GACTTGTGAA AGATGACGTG TATACTTCTA TTCATATTGA AGAGTATGAA TCAGAAGCAC 2760 

GTGATACTAA GTTAGGACCT GAAGAAATCA CAAGAGATAT TCCTAATGTT TCTGAAAGTG 2820 

CACTTAAGAA CTTAGACGAT CGTGGTATCG TTTATATTGG TGCAGAAGTA AAAGATGGAG 2880 

ATATTTTAGT TGGTAAAGTA ACGCCTAAAG GTGTAACTGA GTTAACTGCC GAAGAAAGAT 2940 

TGTTACATGC AATCTTTGGT GAAAAAGCAC GTGAAGTTAG AGATACTTCA TTACGTGTAC 3000 

CTCACGGCGC TGGCGGTATC GTTCTTGATG TAAAAGTATT CAATCGTGAA GAAGGCGACG 3060 

ATACATTATC ACCTGGTGTA AACCAATTAG TACGTGTATA TATCGTTCAA AAACGTAAAA 3120 

TTCATGTTGG TGATAAGATG TGTGGTCGAC ATGGTAACAA AGGTGTCATT TCTAAGATTG 3180 

TTCCTGAAGA AGATATGCCT TACTTACCAG ATGGACGTCC GATCGATATC ATGTTAAATC 3240 

CTCTTGGTGT ACCATCTCGT ATGAACATCG GACAAGTATT AGAGCTACAC TTAGGTATGG 3300 


CTGCTAAAAA TCTTGGTATT CACGTTGCAT CACCAGTATT TGACGGTGCA AACGATGACG 3360 

ATGTATGGTC AACAATTGAA GAAGCTGGTA TGGCTCGTGA TGGTAAAACT GTACTTTATG 3420 

ATGGACGTAC AGGTGAACCA TTCGATAACC GTATTTCAGT AGGTGTAATG TACATGTTGA 3480 

AACTTGCGCA CATGGTTGAT GATAAATTAC ATGCGCGTTC AACAGGACCA TATTCACTTG 3540 

tTACACAACA ACCACTTGGC GGTAAAGCGC AATTCGGTGG ACAACGTTTT GGTGAGATGG 3600 

AGGTATGGGC ACTTGAAGCA TATGGTGCTG CATACACATT ACAAGAAATC TTAACTTACA 3660 


AATCCGATGA TACAGTAGGA CGTGTGAAAA CATACGAGGC TATTGTTAAA GGTGAAAACA 3720 

TCTCTAGACC AAGTGTTCCA GAATCATTCC GAGTATTGAT GAAAGAATTA CAAAGTTTAG 3780 

GTTTAGATGT AAAAGTTATG GATGAGCAAG ATAATGAAAT CGAAATGACA GACGTTGATG 3840 

ACGATGATGT TGTAGAACGC AAAGTAGATT TACAACAAAA TGATGCTCCT GAAACACAAA 3900 

AAGAAGTTAC TGATTAATAC GCAATTTACA AAACAGGCAA AAAGATACTA AGCTGAATTT 3960 

TATTGATGAT TCAGTTTAGT ACTTTAAGCC ATTTTAAATA AATGCAAATC AATCAAATAG 4020 

CACAGCTAAT CTAAATTGAA GGAGGTAGGC TCCTTGATTG ATGTAAATAA TTTCCATTAT 4080 
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AAACCTGAAA CAATCAACTA CCGTACATTA AAACCTGAAA AAGATGGTCT ATTCTGTGAA 4200 

AGAATTTTCG GACCTACAAA AGACTGGGAA TGTAGTTGTG GTAAATACAA ACGTGTTCGC 4260 

TACAAAGGCA TGGTCTGTGA CAGATGTGGA GTTGAAGTAA CTAAATCTAA AGTACGTCGT 4320 

GAAAGAATGG GTCACATTGA ACTTGCTGCT CCAGTTTCTC ACATTTGGTA TTTCAAAGGT 4380 

ATACCAAGTC GTATGGGATT ATTACTTGAC ATGTCACCAA GAGCATTAGA AGAAGTTATT 4440 

TACTTTGCTT CTTATGTTGT TGTAGATCCA GGTCCAACTG GTTTAGAAAA GAAAACTTTA 4500 

TTATCTGAAG CTGAATTCAG AGATTATTAT GATAAATACC CAGGTCAATT CGTTGCAAAA 4560 

ATGGGTGCAG AAGGTATTAA AGATTTACTT GAAGAGATTG ATCTTGACGA AGAACTTAAA 4620 

TTGTTACGCG ATGAGTTGGA ATCAGCTACT GGTCAAAGAC TTACTCGTGC AATTAAACGT 4680 

TTAGAAGTTG TTGAATCATT CCGTAATTCA GGTAACAAAC CTTCATGGAT GATTTTAGAT 4740 

GTACTTCCAA TCATCCCACC AGAAATTCGT CCAATGGTTC AATTAGATGG TGGACGATTT 4800 

GCAACAAGTG ACTTAAACGA CTTATACCGT CGTGTAATTA ATCGAAATAA TCGTTTGAAA 4860 

CGTTTATTAG ATTTAGGTGC ACCTGGTATC ATCGTTCAAA ACGAAAAACG TATGTTACAA 4920 

GAAGCCGTTG ACGCTTTAAT TGATAATGGT CGTCGTGGTC GTCCAGTTAC TGGCCCAGGT 4980 

AACCGTCCAT TAAAATCTTT ATCTCATATG TTAAAAGGTA AACAAGGTCG TTTCCGTCAA 5040 

AACTTACTTG GTAAACGTGT TGACTATTCA GGACGTTCAG TTATTGCAGT AGGTCCAAGC 5100 

TTGAAAATGT ACCAATGTGG TTTACCAAAA GAAATGGCAC TTGAACTATT TAAACCATTC 5160 

GTAATGAAAG AATTAGTTCA ACGTGAAATT GCAACTAACA TTAAAAATGC GAAGAGTAAA 5220 

ATCGAACGTA TGGATGATGA AGTTTGGGAC GTATTGGAAG AAGTAATTAG AGAACATCCT 5280 

GTATTACTTA ACCGTGCACC AACACTTCAT AGACTTGGTA TTCAAGCATT TGAACCAACT 5340 

TTACTTGAAG GTCGTGCGAT TCGTCTACAT CCACTTGTAA CAACAGCTTA TAACGCTGAC 5400 

TTTGACGGTG ACCAAATGGC GGTTCACGTT CCTTTATCAA AAGAGGCACA AGCTGAAGCA 5460 

AGAATGTTGA TGTTAGCAGC ACAAAACATC TTGAACCCTA AAGATGGTAA ACCTGTAGTT 5520 

ACACCATCAC AAGATATGGT ACTTGGTAAC TATTACCTTA CTTTAGAAAG AAAAGATGCA 5580 

GTAAATACAG GCGCAATCTT TAATAATACA AATGAAGTAT TAAAAGCATA TGCAAATGGC 5640 

TTTGTACATT TACACACTAG AATTGGTGTA CATGCAAGTT CGTTCAATAA TCCAACATTT 5700 

ACTGAAGAAC AAAACAAAAA GATTCTTGCT ACGTCAGTAG GTAAAATTAT ATTCAATGAA 5760 

ATCATTCCAG ATTCATTTGC TTATATTAAT GAACCTACGC AAGAAAACTT AGAAAGAAAG 5820 

ACACCAAACA GATATTTCAT CGATCCTACA ACTTTAGGTG AAGGTGGATT AAAAGAATAC 5880 
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GAAGTATTCA ACAGATTTAG CATCACTGAT ACATCAATGA TGTTAGACCG TATGAAAGAC 6000 

TTAGGATTCA AATTCTCATC TAAAGCTGGT ATTACAGTAG GTGTTGCTGA TATCGTAGTA 6060 

TTACCTGATA AGCAACAAAT ACTTGATGAG CATGAAAAAT TAGTCGACAG AATTACAAAA 6120 

CAATTCAACC GTGGTTTAAT CACTGAAGAA GAAAGATATA ATGCAGTTGT TGAAATTTGG 6180 

ACAGATGCAA AAGATCAAAT TCAAGGTGAA TTGATGCAAT CACTTGATAA AACTAACCCA 6240 

ATCTTCATGA TGAGTGATTC AGGTGCCCGT GGTAACGCAT CTAACTTTAC ACAGTTAGCA 6300 

GGTATGCGTG GATTGATGGC CGCACCATCT GGTAAGATTA TCGAATTACC AATCACATCT 6360 

TCATTCCGTG AAGGTTTAAC AGTACTTGAA TACTTCATCT CAACTCACGG TGCACGTAAA 6420 

GGTCTTGCCG ATACAGCACT TAAAACAGCT GACTCAGGAT ATCTTACTCG TCGTCTTGTT 6480 

GACGTGGCAC AAGATGTTAT TGTTCGTGAA GAAGACTGTG GTACTGATAG AGGTTTATTA 6540 

GTTTCTGATA TTAAAGAAGG TACAGAAATG ATTGAACCAT TTATCGAACG TATTGAAGGT 6600 

CGTTATTCTA AAGAAACAAT TCGTCATCCT GAAACTGATG AAATAATCAT TCGTCCTGAT 6660 

GAATTAATTA CACCTGAAAT TGCTAAGAAA ATTACAGATG CTGGTATTGA ACAAATGTAT 6720 

ATTCGCTCAG CATTTACTTG TAACGCACGA CATGGTGTTT GTGAAAAATG TTACGGTAAA 6780 

AACCTTGCTA CTGGTGAAAA AGTTGAAGTT GGTGAAGCAG TTGGTACAAT TGCAGCCCAA 6840 

TCTATCGGTG AACCAGGTAC ACAGCTTACA ATGCGTACAT TCCATACAGG TGGGGTAGCA 6900 

GGTAGCGATA TCACACAAGG TCTTCCTCGT ATTCAAGAGA TTTTCGAAGC ACGTAACCcT 6960 

AAAGGTCAAG CGGTAATTAC GGAAATCGAA GGTGTCGTAG AAGATATTAA ATTAGCAAAA 7020 

GATAGACAAC AAGAAATTGT TGTTAAAGGT GCTAATGAAA CAAGATCATA CCTTGCTTCA 7080 

GGTACTTCAA GAATTATTGT AGAAATCGGT CAACCAGTTC AACGTGGTGA AGTATTAACT 7140 

GAACWTTCTA TTGAACCTAA GAATTACTTA TCTGTTGCTG GATTAAACGC GACTGAAAGC 7200 

TACTTATTAA AAGAAGTACA AAAAGTTTAC CGTATGCAAG GTGTAGAAAT CGACGATAAA 7260 


CACGTTGAGG TTATGGTTCG ACAAATGTTA CGTAAAGTTA GAATTATCGA AGCAGGTGAT 7320 

ACGAAGTTAT TACCAGGTTC ATTAGTTGAT ATTCATAACT TTACAGATGC AAATAGAGAA 7380 

GCATTTAAAC ACCGTAAGCG TCCTGCAACA GCTAAACCAG TATTACTTGG TATTACTAAA 7440 

GCATCACTTG AAACAGAAAG TTTCTTATCT GCAGCATCAT TCCAAGAAAC AACAAGAGTT 7500 

CTTACAGATG CAGCAATTAA AGGTAAGCGT GATGACTTAT TAGGTCTTAA AGAAAACGTA 7560 

ATTATTGGTA AGTTAATTCC AGCTGGTACT GGTATGAGAC GTTATAGCGA CGTAAAATAC 7620 

GAAAAAACAG CTAAACCAGT TGCAGAAGTT GAATCTCAAA CTGAAGTAAC GGAATAACAA 7680 
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ATGTTGACGA ATTCTCTTGT TCAATGTTAA TATATTAAAG GTTGATGCAA GCAGAACTTT 7800 

GGAGGATAAA TTATTGTCTA AGGAAAAAGT tGCACGCTTT AACAAACAAC ATTTTGTAGT 7860 

TGGTCTTAAA GAAACGCTTA AAGCGTTAAA GAAAGATCAA GTTACATCTT TGATTATTGC 7920 

TGAAGACGTT GAAGTATATT TAATGACTCG CGTGTTAAGC CAAATCAATC AGAAAAATAT 7980 

ACCTGTATCT TTTTTCAAAA GCAAACATGC TTTGGGTAAA CATGTAGGTA TTAACGTCAA 8040 

TGCGACAATA GTAGCATTGA TTAAATGAGA ATTAGTAAGT GTTTTACTTA CTAAATTTTA 8100 

TTTAACCTAA AAATGAACCA CCTGGATGTG TGGGATTAAA AAGTGAAGAG AGGAGGACAT 8160 

ATCACATGCC AACTATTAAC CAATTAGTAC GTAAACCAAG ACAAAGCAAA ATCAAAAAAT 8220 

CAGATTCTCC AGCTTTAAAT AAAGGTTTCA ACAGTAAAAA GAAAAAATTT ACTGACTTAA 8280 

ACTCACCACA AAAACGTGGT GTATGTACTC GTGTAGGTAC AATGACACCT AAAAAACCTA 8340 

ACTCAGCGTT ACGTAAATAT GCACGTGTGc gTtTATCAAA CAACATCGAA ATTAACGCAT 8400 

ACATCCCTGG TATCGGACAT AACTTACAAG AACACAGTGT TGTACTTGTA CGTGGTGGAC 8460 

GTGTAAAAGA CTTACCAGGT GTGCGTTACC ATATTGTACG TGGAGCACTT GATACTTCAG 8520 

GTGTTGACGG ACGTAGACAA GGTCGTTCAT TATACGGAAC TAAGAAACCT AAAAACTAAG 8580 

AATTTAGTTT TTAATTAAAT CTTAAACTTA AAATATTTAA TATAAGGAAG GGAGGATTTA 8640 

CATTATGCCT CGTAAAGGAT CAGTACCTAA AAGAGACGTA TTACCAGATC CAATTCATAA 8700 

CTCTAAGTTA GTAACTAAAT TAATTAACAA AATTATGTTA GATGGTAAAC GTGGAACAGC 8760 

ACAAAGAATT CTTTATTCAG CATTCGACCT AGTTGAACAA CGCAGgtTCG TGATGCATTA 8820 

GAAGTATTCG AAGAAGCAAT CAACAACATT ATGCCAGTAT TAGAAGTTAA AGCTCGTCGC 8880 

GTAGGTGGTT CTAACTATCA AGTACCAGTA GAAGTTCGTC CAGAGCGTCG TACTACTTTA 8940 

GGTTTACGTT GGTTAGTTAA CTATGCACGT CTTCGTGGTG AAAAAACGAT GGAAGATCGT 9000 

TTAGCTAACG AAATTTTAGA TGCAGCAAAT AATACAGGTG GTGCCGTTAA GAAACGTGAG 9060 

GACACTCACA AAATGGCTGA AGCAAACAAA GCATTTGCTC ACTACCGTTG GTAAGATAAA 9120 

AGCTTTTACC CTGAGTGTGT TCTATATTAA TGAATTTTCA TTAAGCGTTC ATGCTTAGGG 9180 

CATCGCCATA TCTATCGTAT TTATTCAGTA ATATAAACTG GAAGGAGAAA AAATACATGG 9240 

CTAGAGAATT TTCATTAGAA AAAACTCGTA ATATCGGTAT CATGGCTCAC ATTGATGCTG 9300 

GTAAAACGAC TACGACTGAA CGTATTCTTT ATTACACTGG CCGTATCCAC AArGknGGTG 9360 

AAaCACACGA AGGTGCTTCA CAAATGGACT GGATGGAGCA AGAACAAGAC CGTGGTATTA 9420 

CTATCACATC TGCTGCAACA ACAGCAGCTT GGGAAGGTCA CCGTGTAAAC ATTATCGATA 9480 
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CAGTTACAGT ACTTGATGCA CAATCAGGTG TTGAACCTCA AACTGAAACA GTTTGGCGTC 9600 

AGGCTACAAC TTATGGTGTT CCACGTATCG TATTTGTAAA CAAAATGGAC AAATTAGGTG 9660 

CTAACTTCGA ATACTCTGTA AGTACATTAC ATGATCGTTT ACAAgCTAAC GCTGCTCCAA 9720 

TCCAATTACC AATTGGTGCG GAAGACGAAT TCGAAGCAAT CATTGACTTA GTTGAAATGA 9780 

AATGTTTCAA ATATACAAAT GATTTAGGTA CTGAAATTGA AGAAATTGAA ATTCCTGAAG 9840 

ACCACTTAGA TAGAGCTGAA GAAGCTCGTG CTAGCTTAAT CGAAGCAGTT GCAGAAACTA 9900 

GCGACGAATT AATGGAAAAA TATCTTGGTG ACGAAGAAAT TTCAGTTTCT GAATTAAAAG 9960 

AAGCTATCCG CCAAGCTaCt AcTAACGTAG AATTCTACCC AGTACTTTGT GGTACAGCTT 10020 

TCAAAAACAA AGGTGTTCAA TTAATGCTTG ACGCTGTAAT TGATTACTTA CCTTCACCAC 10080 

TAGACGTTAA ACCAATTATT GGTCACCGTG CTAGCAACCC TGAAGAAGAA GTAATCGCGA 10140 

AAGCAGACGA TTCAGCTGAA TTCGCTGCAT TAGCGTTCAA AGTTATGACT GACCCTTATG 10200 

TTGGTAAATT AACATTCTTC CGTGTGTATT CAGGTACAAT GACATCTGGT TCATACGTTA 10260 

AGAACTCTAC TAAAGGTAAA CGTGAACGTG TAGGTCGTTT ATTACAAATG CACGCTAACT 10320 

CACGTCAAGA AATCGATACT GTATACTCTG GAGATATCGC TGCTGCGGTA GGTCTTAAAG 10380 

ATACAGGTAC TGGTGATACT TTATGTGGTG AGAAAAATGA CATTATCTTG GAATCAATGG 10440 

AATTCCCAGA GCCAGTTATT CACTTATCAG TAGAGCCAAA ATCTAAAGCT GACCAAGATA 10500 

AAATGACTCA AGCTTTAGTT AAATTACAAG AAGAAGACCC AACATTCCAT GCACACACTG 10560 

ACGAAGAAAC TGGACAAGTT ATCATCGGTG GTATGGGTGA GCTTCACTTA GACATCTTAG 10620 

TAGACCGTAT GAAGAAAGAA TTCAACGTTG AATGTAACGT AGGTGCTCCA ATGGTTTCAT 10680 

ATCGTGAAAC ATTCAAATCA TCTGCACAAG TTCAAGGTAA ATTCTCTCGT CAATCTGGTG 10740 

GTCGTCGTCA ATACGGTGAT GTTCACATTG AATTCACACC AAACGAAACA GGCGCAGGTT 10800 

TCGAATTCGA AAACGCTATC GTTGGTGGTG TAGTTCCTCG TGAATACATT CCATCAGTAG 10860 

AAGCTGGTCT TAAAGATGCT ATGGAAAATG GTGTTTTAGC AGGTTATCCT TTAATTGATG 10920 

TTAAAGCTAA ATTATATGAT GGTTCATACC ATGATGTCGA TTCATCTGAA ATGGCCTTCA 10980 

AAATTGCTGC ATCATTAGCA CTTAAAGAAG CTGCTAAAAA ATGTGATCCT GTAATCTTAG 11040 

AACCAATGAT GAAAGTAACT ATTGAAATGC CTGAAGAGTA CATGGGTGAT ATCATGGGTG 11100 

ACGTAACATC TCGTCGTGGA CGTGTTGATG GTATGGAACC TCGTGGTAAT GCACAAGTTG 11160 

TTAATGCTTA TGTACCACTT TCAGAAATGT TCGGTTATGC AACATCATTA CGTTCAAACA 11220 

CTCAAGGTCG CGGTACTTAC ACTATGTACT TCGATCACtA TGCTGAAGTT CCaAAATCaA 11280 
GCCTAGGTTA AAATACAAGG TGAGCTTAAA TGTAAGCTAT CATCTTTATA GTTTGATTTT 11400

TTGGGGTGAA TGCATTATAA AAGAATTGTA AAATTCTTTT TGCATCGCTA TAAATAATTT 11460 

CTCATGATGG TGAGAAACTA TCATGAGAGA TAAATTTAAA TATTATTTTT AATTAGAATA 11520 

GGAGAGATTT TATAATGGCA AAAGAAAAAT TCGATCGTTC TAAAGAACAT GCCAATATCG 11580 

GTACTATCGG TCACGTTGAC CATGGTAAAA CAACATTAAC AGCAGCAATC GCTACTGTAT 11640 

TAGCAAAAAA TGGTGACTCA GTTGCACAAT CATATGACAT GATTGACAAC GCTCCAGAAG 11700 

AAAAAGAACG TGGTATCACA ATCAATACTT CTCACATTGA GTACCAAACT GACAAACGTC 11760 

ACTACGCTCA CGTTGACTGC CCAGGACACG CTGACTACGT TAAAAACATG ATCACTGGTG 11820 

CTGCTCAAAT GGACGGCGGT ATCTTAGTAG TATCTGCTGC TGACGGTCCA ATGCCACAAA 11880 

CTCGTGAACA CATTCTTTTA TCACGTAACG TTGGTGTACC AGCATTAGTA GTATTCTTAA 11940 

ACAAAGTTGA CATGGTTGAC GATGAAGAAT TATTAGAATT AGTAGAAATG GAAGTTCGTG 12000 

ACTTATTAAG CGAATATGAC TTCCCAGGTG ACGATGTACC TGTAATCGCT GGTTCAGCAT 12060 

TAAAAGCTTT AGAAGGCGAT GCTCAATACG AAGAAAAAAT CTTAGAATTA ATGGAAGCTG 12120 

TAGATACTTA CATTCCAACT CCAGAACGTG ATTCTGACAA ACCATTCATG ATGCCAGTTG 12180

AGGACGTATT CTCAATCACT GGTCGTGGTA CTGTTGCTAC AGGCCGTGTT GAACGTGGTC 12240 

AAATCAAAGT TGGTGAAGAA GTTGAAATCA TCGGTTTACA TGACACATCT AAAACAACTG 12300 

TTACAGGTGT TGAAATGTTC CGTAAATTAT TAGACTACGC TGAAGCTGGT GACAACATTG 12360

GTGCATTATT ACGTGGTGTT GCTCGTGAAG ACGTACAACG TGGTCAAGTA TTAGCTGCTC 12420 

CTGGTTCAAT TACACCACAT ACTGAATTCA AAGCAGAAGT ATACGTATTA TCAAAAGACG 12480 

AAGGTGGACG TCACACTCCA TTCTTCTCAA ACTATCGTCC ACAATTCTAT TTCCGTACTA 12540 

CTGA6GTAAC TGGTGTTGTT CACTTACCAG AAGGTACTGA AATGGTAATG CCTGGTGATA 12600 

ACGTTGAAAT GACAGTAGAA TTAATCGCTC CAATCGCGAT TGAAGACGGT ACTCGTTTCT 12660 

CAATCCGTGA AGGTGGACGT ACTGTAGGAT CAGGCGTTGT TACTGAAATC ATTAAATAAT 12720 

TTCTAATTTC TTAGATTTTA TATAAAAAGA AGATCCCTCA ATCGAGGGGt CTTTTTTTAA 12780 

TGTGTAAATT TTGTAATGGC TATTCGATTT AGAAGAACAA TAATTGATGA AAGACTGACT 12840 

AATAAAACTT ATAACTGATA ATACTGTTTA AATAAAATTG TTGAGTCTTG GACATTGTAA 12900 

AATGCTCCCT TCAAAGTTTT CATTTTTTCa ATGTCTACTT TGAAGGGAGC ATTTCATTAG 12960 

TTTATGTCTC AGATTCATAT CTTTCAATTA ATTTAAATGC TTAATTTGTT TTAAATACTT 13020

GCTCTAATTC TATGATTTTT AAAAATACAG CTACAGCGTA TTTTAATGAT TTTTCATCAA 13080 
TCAGAAAGAA TGCACCTGGT CGTACTTTCA AATAATGTGA AAAATCTTCT CCAATCATCA 13200

TTAAATCTGA TTCATTAAAG CGTACATGTA AGTCATTTGT TGCTTCTTTA ATAACTTGAT 13260

ATGCTTTCTC GTTATTATGG ACAGGCAAAT ACCCTTTAAT ATAATTCAAA TCATAGTTAA 13320

TATCATTTGC TATTGCTAAA CCTTGTAGAA GCTTATCCAT TTTGTCCATT ACATGATTCT 13380

GTATATCTGA ATCGAAAGTT CTAACTGTAC CTTTACAAAA TGCTTGATCA GGAATAACGC 13440

TATCTGTGGT GCCTGCTTGA ATCATTCCAA ATGAAAGTAC AGCTTGTTTA ACTGGATCGA 13500

TCGTACGTGA AATTATTTTT TGTGCACTTA AAATGAACTC TGCCATGATT ACTATTGGGT 13560

CAATGGTTTC ATGAGGTTTG GCACCATGAC CACCACGACC TTTAAATGTG ACGCTAAATT 13620

CATCTGGAGA GGCCATGATT GCCCCCGCAC GTGAATGAAT AGTTCCAGTA GGATAACCAC 13680

TCCATAAATG TGTACCGTAA ATTCTATCTA CATTTTCCAG ACATCCAGCA TCTATCATTT 13740

CTTGAGAACC ACCTGGCATG ATTTCTTCAC CGTACTGGAA TATTAATACA ACATTACCTT 13800

CTAATAAATG TTTATGTTCA TCTAAAATCT CTGCTACAGT AAGTAAAATT GCTGTATGAC 13860

CATCATGCCC ACACGCATGC ATACATCCTG GATTTTTAGA CTTATAAGGC ACATCGTTTA 13920

ATTCCTCGAC AGGTAACGCA TCAAAGTCAG CTCTTAATGC AATGGTAGGT CCTGTGCCCA 13980

AGCCTTTAAA TGTGGCTTTG ATACCATTGC GGCCGATAGG AGTTTCAATA TCACAAGATA 14040

ACTGGCTTAA TTGGTTAACA ATATAATCAT GTGTTTGAAA TTCTTCAAAA GATAACTCAG 14100

GATATTGGTG TAAATAACGT CTGAGTTGAA TTGTTTTATT TTCTTTATTA TTTGCTAGTT 14160

GGAACCAATC TAACACCCTT ATCACTACTT TCTAAAATAA TGTTTATAGT ATAACATTTT 14220

ATGAAATTAT CGTACTAAAT GATTGCTTTG AGATATTTTA TCTATGAATG ATAAGGCTTT 14280

CAAGTTATGT AGAATTACTG TATGATAAAG GTATTACCAA ACAATACTTA AGGGGGATTA 14340

TATACTGTGG TTCAATCATT ACATGAGTTT TTAGAGGAAA ATATAAATTA TCTAAAAGAA 14400

AATGGTTTGT ATAATGAAAT AGATACAATT GAAGGTGCAA ACGGACCAGA AATCAAAATC 14460

AATGGGAAAT CATACATTAA CTTATCTTCA AATAATTATT TAGGACTAGC AACAAATGAA 14520

GATTTGAAAT CaGctGCAAA AGCAGCTATT GATACACATG GTGTAGGTGC AGGCGCTGTT 14580

CGTACAATCA ATGGTACATT AGATTTACAC GACGAATTAG AAGAAACACT AGCAAAATTT 14640

AAAGGAACAG AAGCTGCAAT AGCTTATCAA TCAGGATTTA ATTGTAATAT GGCTGCTATT 14700

TCAGCTGTCA TGAATAAAAA TGATGCTATT TTATCAGATG AGCTTAATCA TGCATCAATT 14760

ATTGATGGAT GTCGCTTATC TAAAGCTAAA ATTATTCGAG TTAACCATTC AGACATGGAT 14820

GATTTACGTG CGAAAGCAAA AGAAGCAGTT GAATCAGGTC AATACAATAA AGTGATGTAT 14880
ATTGCAGAAG AATTTGGTTT ATTAACTTAT GTTGACGACG CTCATGGTTC AGGTGTTATG 15000

GGTAAAGGCG CTGGTACGGT TAAACATTTT GGTTTACAAG ATAAAATCGA TTTCCAAATA 15060

GGTACGCTTT CTAAAGCAAT TGGTGTCGTT GGCGGTTATG TAGCAGGTAC AAAAGAGTTA 15120

ATAGATTGGT TAAAAGCACA ATCACGACCA TTCTTATTCT CTACATCATT AGCACCTGGG 15180

GATACCAAAG CAATAACTGA AGCAGTTAAA AAGTTAATGG ATTCAACTGA ATTACATGAT 15240

AAATTATGGA ACAATGCACA ATATTTAAAA AATGGATTGT CAAAATTAGG ATATGATACA 15300

GGTGAGTCAG AAACTCCAAT TACACCAGTA ATTATTGGTG ATGAAAAAAC AACTCAAGAA 15360

TTTAGTAAGC GTTTAAAAGA CGAAGGTGTC TATGTGAAAT CTATCGTTTT CCCAACAGTA 15420

CCAAGAGGTA CAGGACGTGT AAGAAATATG CCTACAGCTG CACATACAAA AGACATGTTA 15480

GATGAAGCAA TTGCGGCTTA TGAAAAAGTA GGAAAAGAAA TGAAGTTGAT TTAATATTTA 15540

TTTATTCCCA CGGCAAATAT TGTCGTGGGC TTTTTTTAAT GTTTAGTTTA TTAACAGT 15598

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 661 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

AAGTAAATCA ACTTACTGGG ATAAGAATAA AGGCGATTAT AGTAACAAGT TGATTTTATT 60 

CGAAAAACAT TTTGAACCGG TTCTGGGTAT CAAGATGCAA CATAGTGGAG GTCATAGCTT 120 

TGGCCACACG ATTATTACGA TTGAAAGTCA AGGAGATAAA GCAGTTCATA TGGGTGATAT 180 

ATTCCCAACT ACTGCACATA AAAATCCTCT ATGGGTAACG GCATATGATG ATTATCCTAT 240 

GCAATCGATT CGTGAAAAAG AACGCATGAT ACCATATTTT ATTCAGCAAC AATATTGGTT 300 

CTTGTTTTAT CATGATGAAA ACTACTTTGC TGTAAAATAC AGCGATAATG GTGAAAACAT 360 

AGATGCATAT ATTTTACGTG AAACATTAGT TGATAATAAC TAAAATAAAG ATGTATTACT 420 

AAACAAATTT TCAAAAATAA AAAATTGAGC CACATCCAAT CTTACTAATT AGGGTGTGGC 480 

TCATTTTTAA GTTTTACgAT CCAAATCAAA TATGGaTAAA ATTCgTATTA ACGCTCTACa 540 

ATGtTAATGA CTTCACCAGT ATATGCATCT GCATAAAAAT CATAATGAAT ATTTTGACCA 600 

TTTTTAATAG TTGTAATTCC ACCTTGATAA ACTAAACGGT ATTTATCAGT TTCAGGATGA 660 

A 661 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5738 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GCAGACGGTA CAGCAGTTAA AGTCGCACCA AaACTGTAGT GAATcTAATC GGTGcATTCT 60 

TTTTAGGATT AGTTGTCGCG CTTATATATA TCTTCTTCAA AGTAATTTTC GATAAGCGAA 120 

TTAAAGATGA AGAAGATGTA GAGAAAGAAT TAGGATTGCC TGTATTGGGT TCAATTCAAA 180 

AATTTAATTA AGGATGGTTG CTACTTATGT CAAAAAAGGA AAATACGACA ACAACACTAT 240 

TTGTATATGA AAAACCAAAA TCAACAATTA GTGAAAAGTT TCGAGGTATA CGTTCAAACA 300 

TCATGTTTTC AAAAGCAAAT GGTGAAGTAA AGCGCTTATT GGTTACTTCT GAAAAGCCTG 360

GTGCAGGTAA AAGTACAGTT GTATCGAATG TAGCGATTAC TTATGCACAA GCAGGCTATA 420 

AGACATTAGT TATTGATGGC GATATGCGTA AgcCAACACA AAACTATATT TTTAATGAGC 480

AAAATAATAA TGGACTATCA AGCTTAATCA TTGGTCGAAC GACTATGTCA GAAGCAATTA 540

CGTCGACAGA AATTGAAAAT TTAGATTTGC TAACAGCTGG CCCTGTACCT CCAAATCCAT 600 

CTGAGTTAAT TGGGTCTGAA AGGTTCAAAG AATTAGTTGA TCTGTTTAAT AAACGTTACG 660 

ACATTATTAT TGTCGATACA CCGCCAGTTA ATACTGTGAC TGATGCACAA CTATATGCGC 720 

GTGCTATTAA AGATAGTCTG TTAGTAATTG ATAGTGAAAA AAATGATAAr AATGAAGTTA 780 

AAAAAGCAAA AGCACTTATG GAAAAAGCAG GCAGTAACAT TCTAGGTGTC ATTTTGAACA 840 

AGACAAAGGT CGATAAATCT TCTAGTTATT ATCACTATTA TGGAGATGAA TAAGTATGAT 900 

TGATATTCAT AACCATATAT TGCCTAATAT CGATGACGGT CCGACAAATG AAACAGAGAT 960 

GATGGATCTT TTAAAACAAG CGACAACACA AGGTGTTACA GAAATCATTG TAACATCACA 1020 

TCACTTACAT CCTCGATATA CCACACCTAT AGAAAAAGTG AAATCATGTT TAAACCATAT 1080 

TGAAAGCTTA GAGGAAGTAC AAGCACTAAA TCTAAAGTTT TATTATGGTC AGGAAATAAG 1140 

AATTACCGAT CAAATCCTTA ATGATATTGA TCGAAAAGTT ATTAACGGTA TTAATGATTC 1200

ACGCTATTTA CTAATAGAAT TTCCATCAAA TGAAGTTCCA CACTATACTG ATCAATTATt 1260 

TTTCGAATtA CAGAGTAAAG GCTTTGTACC GATTATTGCA CATCCAGAGC GGAATAAAGC 1320 

AATAAGTCAA AACCTTGACA TACTATACGA TTTAATTAAC AAAGGTGCTT TAAGTCAAGT 1380

GACAACGGcG TCATTAGCGG GTATTTCCGG TAAAAAAATT AGAAAATTAG CAATTCAAAT 1440
GTTCTTAATG AAAGACTTAT TTAATGATAA GAAATTACGT GATTATTATG AAGATATGAA 1560

CGGATTTATT AGTAATGCGA AGTTAGTTGT TGATGATAAA AAAATTCCTA AACGAATGCC 1620 

ACAACAAGAT TATAAACAGA AAAGATGGTT TGGGTTATAA ACAGCAAATG AGGGGTTTTA 1680 

TGGCACATTT ATCTGTGAAA TTGCGGCTTT TAATACTAGC ATTAATCGAT TCACTGATAG 1740 

TGACATTTTC AGTATTCGTA AGTTATTACA TTTTAGAACC GTATTTCAAA ACATATTCTG 1800 

TCAAATTATT AATATTGGCA GCTATATCAC TATTCATATC GCATCATATT TCaGCATTTA 1860

TTTTTAATAT GTATCATCGA GCGTGGGAAT ATGCCAGTGT GAGTGAATTG ATTTTAATTG 1920 

TTAAAGCTGT GACGACATCT ATCGTTATTA CGATGGTGGT CGTGACAATT GTTACAGGCA 1980 

ATAGACCGTT TTTTAGATTG TATTTAATTA CTTGGATGAT GCACTTGATT TTAATAGGTG 2040 

GCTCAAGGTT ATTTTGGCGT ATTTATCGGA AATACCTTGG AGGTAAGTCA TTTAATAAGA 2100 

AGCCAACTTT AGTTGTTGGT GCTGGTCAAG CAGGTTCAAT GCTGATTAGA CAAATGTTGA 2160

AAAGTGACGA AATGAAACTT GAACCGGTAT TAGCAGTCGA TGATGACGAA CATAAACGCA 2220 

ATATCACAAT TACTGAGGGT GTAAAAGTCC AAGGTAAAAT TGCGGATATT CCAGAACTAG 2280

TGAGGAAATA TAAGATTAAA AAAATCATCA TTGCAATTCC AACTATTGGT CAAGAGCGTT 2340

TGAAAGAAAT TAATAATATT TGCCATATGG ATGGCGTTGA GTTATTGAAA ATGCCAAATA 2400 

TAGAAGACGT CATGTCTGGT GAGTTAGAAG TGAACCAACT TAAAAAAGTT GAAGTAGAAG 2460 

ATTTACTAGG CAGAGATCCT GTTGAATTAG ATATGGATAT GATATCAAAT GAATTGACGA 2520 

ATAAAACTAT TTTAGTTACG GGTGCAGGTG GTTCAATAGG ATCAGAAATT TGTAGACAAG 2580 

TTTGTAATTT CTATCCAGAA CGTATTATTC TACTTGGCCA TGGTGAAAAC AGTATTTATT 2640 

TAATCAATCG TGAATTGCGA AATCGCTTCG GwAAAAATGT TGATATCGTT CCTATTATAG 2700 

CGGATGTGCA AAATAGAGCG CGTATGTTTG AAATTATGGA AACGTATAAA CCATACGCAG 2760 

TTTATCATGC AGCAGCACAC AAGCACGTGC CGTTAATGGA AGACAACCCT GAAGAAGCAG 2820 

TACGTAATAA TATTTTAGGT ACGAAAAATA CTGCTGAAGC TGCTAAAAAT GCAGAGGTAA 2880 

AGAAATTCGT TATGATTTCT ACGGATAAAG CCGTTAATCC GCCTAATGTC ATGGGAGCTT 2940 

CAAAGCGAAT TGCAGAAATG ATTATTCAAA GTTTAAATGA TGAAACGCAT CGAACAAATT 3000 

TTGTTGCAGT GAGATTTGGT AATGTACTTG GATCGAGAGG ATCTGTGATT CCACTTTTCA 3060 

AAAGTCAAAT TGAAGAAGGT GGGCCAGTTA CTGTGACACA TCCTGAAATG ACACGTTACT 3120 

TTATGACAAT TCCTGAAGCT TCTAGACTAG TTTTGCAGGC AGGGGCATTA GCAGAAGGTG 3180

GCGAAGTATT TGTGCTAGAT ATGGGAGAAC CAGTGAAAAT TGTAGATTTG GCACGTAATT 3240 
CCGGCGAAAA AATGTTTGAA GAGCTTATGA ATAAAGATGA GGTTCATCCT GAACAAGTAT 3360

TTGAAAAAAT TTATCGTGGC AAAGTACAAC ATATGAAATG TAATGAAGTT GAAGCGATTA 3420

TTCAAGACAT CGTCAATGAC TTTAGTAAAG AAAAAATTAT TAACTATGCC AATGGCAAAA 3480

AGGGAGATAA TTATGTTCGA TGACAAAATT TTATTAATTA CTGGGGGCAC AGGATCATTC 3540

GGTAATGCTG TTATGAAACA GTTTTTAGAT TCTAATATTA AAGAAATTCG TATTTTTTCA 3600

CGCGATGAGA AAAAACAAGA TGACATTCGA AAAAAATATA ATAATTCAAA ATTAAAGTTC 3660

f ACATTGGTG ATGTGCGTGA TAGTCAAAGT GTAGAAACAG CAATGCGAGA TGTTGATTAC 3720

GTATTCCATG CAGCAGCTTT AAAACAAGTG CCGTCATGTG AATTCTTTCC AGTTGAGGCA 3780

GTGAAGACAA ATATTATTGG TACAGAAAAT GTCTTACAAA GTGCTATTCA TCAAAATGTT 3840

AAAAAAGTCA TATGTTTATC TACAGATAAG GCAGCGTATC CTATTAATGC TAGGGGTATT 3900

TCAAAAGCAA TGATGGAAAA AGTATTCGTA GCCAAATCAA GAAATATTCG TAGTGAACAA 3960

ACGCTTATTT GTGGTACAAG ATACGGTAAT GTGATGGCTT CAAGAGGATC AGTAATACCT 4020

TTGTTTATCG ACAAAATCAA AGCTGGAGAA CCTTTAACGA TTACAGATCC TGATATGACA 4080

AGATlTri'TAA TGAGCTTAGA AGATGCGGTA GAACTAGTTG TTCATGCATT TAAGCATGCA 4140

GAGACAGGAG ATATTATGGT TGAAAAAGCA CCAAGCTCAA CGGTAGGGGA TCTTGCGACC 4200

GCATTATTAG AATTGTTTGA AGCTGATAAT GCAATTGAAA TCATTGGTAC GCGACATGGA 4260

GAGAAAAAAG CAGAAACATT GTTGACGAGA GAAGAATACG CACAATGTGA AGATATGGGT 4320

GATTATTTTA GAGTGCCGGC AGACTCCAGA GATTTAAATT ATAGTAATTA TGTTGAAACC 4380

GGTAACGAAA AGATTACGCA ATCTTATGAA TATAACTCCG ATAATACACA TATTTTAACG 4440

GTGGAAGAGA TAAAAGAAAA ACTTTTAACA CTAGAATATG TTAGAAACGA ATTGAATGAT 4500

TATAAAGCTT CAATGAGATA GGAGAGATTG ACGTTGAATA TTGTAATTAC AGGAGCAAAA 4560

GGTTTTGTAG GAAAAAACTT GAAAGCAGAT TTAACTTCAA CGACAGATCA TCATATTTTC 4620

GAAGTACATC GACAAACTAA AGAGGAAGAA TTAGAGTCAG CATTGTTGAA AGCAGACTTT 4680

GTCGTGCATT TAGCGGGTGT TAATCGACCT GAACATGACA AAGAATTCAG CTTAGGAAAC 4740

GTGAGTTATT TAGATCATGT ACTTGATATA TTAACTAGAA ATACGAAAAA GCCAGCGATA 4800

TTATTATCGT CTTCAATACA AGCAACACAA GATAATCCTT ATGGTGAGAG TAAGTTGCAA 4860

GGGGAACAGC TATTAAGAGA GTATGCCGAA GAGTATGGCA ATACGGTTTA TATTTATCGC 4920

TGGCCAAATT TATTCGGCAA GTGGTGTAAG CCGAATTATA ACTCAGTGAT AGCAACATTT 4980

TGTTACAAAA TTGCACGTAA CGAAGAGATT CAAGTTAATG ATCGGAATGT TGAACTAACG 5040
ATTGAAAATG GTGTACCTAC AGTACCAAAC GTATTTAAAG TGACATTGGG AGAAATTGTA 5160

GATTTATTAT ACAAGTTCAA ACAGTCACGT CTCGATCGAA CATTGCCGAA ATTAGATAAC 5220 

TTGTTTGAAA AAGATTTGTA TAGTACGTAT TTAAGCTATC TACCTAGTAC aGACTTTAGT 5280 

TAyCCCTTAC TTATGAATGT GGATGATAGG GGTTCTTTTA CAGAATTTAT AAAAACACCG 5340 

GATCGTGGTC AAGTTTCTGT AAATATTTCT AAACCAGGTA TTACTAAAGG TAATCACTGG 5400 

CATCATACTA AAAACGAAAA ATTTCTAGTC GTATCAGGTA AAGGGGTAAT TCGTTTTAGA 5460 

CATGTTAATG ATGATGAAAT CATTGAATAT TATGTTTCTG GCGACAAATT AGAAGTTGTA 5520 

GACATACCAG TAGGATACAC ACATAATATT GAAAATTTAG GCGACACAGA TATGGTAACT 5580

ATTATGTGGG TGAATGAAAT GTTTGATCCA AATCAGCCAG ATACGTATTT CTTGGAGGTA 5640 

TAGCGCATGG aAAAACTGAA rTTAATGACA ATAGTTGGTA CAAGGCCTGA AATCATTCGT 5700 

TTATCATCAA CGATTAAAGC ATGTGATCAA TATtTTAA 5738

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9062 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
ATCATCAACA AGAATGATAT TTTTCCCATC TACTATATCT TTTACCGCAG ATAACTTCAC 60 
TCTCACACCT TGCTCACGTA ATTCTTGAGT TGGTTGAATA AATGTTCTTG CAACATATTG 120 

ATTTTTAACT AGTCCCATTT CATATGGCAA ACCTATTTCT TCAGCATAAC CACTCGCAGC 180 
TGATAGCGAT gAATTGGGTA CACCGATGAC CATATCAGCA TTTACAGGGC TTTCTTGGGC 240 
TAATTTTTTA CCAGAAGCTT TACGTACTGC ATGGACATTT TTACCAGCTA TTGTTGAGTC 300 

TGGTCTAGCA AAATAAATAT ATTCCATCGC AGAAATTGCA GTTGTCGTAT GATGTGTATA 360

AGATTTAACT GTAATACCTT TATCGTTAAT CACGACATAT TCACCTGCAT GAATATCTTG 420 

AACAAATTCT GCACCTAACA CATCTATTGC ACATGTTTCA CTTGCAAGGA TGTATGTCCC 480

ATCTTTCATT TTACCTACAA CAAGTGGTCT GATAGCATTT GGATCTACTG CGCCATATAA 540 
CGCATCTTTA GTTAAAATCG CAAATGTAAA ACCGCCTTTA ACTTTTCGCA AACTTTCTTT 600 

CAACGCTTCC TCAAAAGTAG GAGCTTTACT TCGACGTATC AAATGCATAA TGACTTCAGT 660

ATCAGAAGAC GAATGGAAGA TAGCACCTTG TTTTTCTAAA TTCTGACGCA ATGATTTAGC 720 
CGGTTGAATA TTTTCAATAC CTTTATTACC TGAAGTAGCA TAACGGACGT GACCAATTGC 840

ATGTTGATAT CCTTTTAATC GTTCCATTTG ATCATCTTTA ATCGCTTCAG TTAGTAAGCC 900 

TAATCCTCGC TCGCCTTTTA ATTCATTTTG ATCAGAAACA ACTATACCTG CACCTTCTTG 960

ACCACGATGT TGCAAACTAT GAAGTCCCAT ATAtGTTAGT TGCGCTGCtT CaGGATGATT 1020 

CCAAATACCA AACACGCCAC ATTCTTCGTT TAATCCTGAG TAGTTAAACA TTGaGCAATT 1080 

GCCCCtTCCC ATATTTGTTT AATATCTGAA ACATTTTCAC TAATCTCTGT aTATGGTGTT 1140 

GTTACCTTGr aATTATCACT ATCTGTTAAA AGTCCAATTT CTATTGCATT ATCAATATTT 1200 

AAAGTTTTAC CTGATTTAAC AGAAACAACA TATCGGCCTT GCGTCTCACT AAACAATTGT 1260 

GCATTTGTTA TATCTATTGA AGATTTTAAT CCTAAACCGT AATGCGCACT TAGTTTAGCT 1320 

AAGGTAATCA GTAAGCCACC TTTACCAACT GTTTGAACAT GTGATAATAG TCCTTCACGA 1380 

ATAGCGGTCT TGATTGATTC ACCTTTTTCA ACTTCTGAAC TCAAATCTAA TGACTCAAAT 1440

TCATGATTAA CTTTGCCATA AATTAACTTT TCAAGTTGAC TACCACCAAA GTCGTCCTTA 1500 

GTATCACCGA TTAAATATAA TTTATCTCCA ACTTGAGGTT CAAAATCATT TAAATAATTT 1560 

ACATTTTCAA TCAAACCTAC CATTCCAACA ACTGGTGTTG GGAAAATAGA AGTACCTTTC 1620

GTTTCGTTAT ATAAAGATAC ATTACCAGAA ACTACTGGTG TCTTAAGAAT GTCGCATGCT 1680 

TCTGCCATAC CTTTCGTTGA ATCTATCAAC TGTTGATAGA TTTCTTTCTT TTCAGGAGAA 1740

CCATAATTTA AACAATCTGT CATTGCTAAT GGTGTTGCAC CCACGGCAAT TAAATTTCGA 1800

TAAGCTTCAG CTACTACCAT CTTTCCACCT TCATATGGAT TGTTATATAC ATAACGCGCT 1860 

TCACCATCAA TTGTTGAAGC AATTGCCTTA TTTGTGCCTT CCACACGTAC TACCGATGCT 1920 

TGAAGTCCTG GCTTAATTAT CGTATTGGCA CCAACTTGTT GGTCGTATTG ATCATATAAA 1980 

TAGTGTTTAG ATGCTATAGT CGGATGCTTA AGTAATTTAA AGAAAGTATC TTTAACATCG 2040 

ATGTGTGTAT AATCATTTTT AGAAGTATTA TAATCTTTTT CTTCTCCTTC TAAAATATAT 2100 

ACAGGTGCTT CATCAGCTAG TGGTTCAACT GGAATGTCAG CATAAACTTC GTCATCATAT 2160 

GTTAAAACAA AACGATTTGT ATCTGTAACT TCACCTATAA CAGCACTATC CAATTCGTGC 2220 

TTATCAAATA AATCTAAGAA TTTTTGTTCA GTACCTTTTT CAACAACTAG TAACATACGT 2280

TCTTGAGTTT CTGAAAGCAT CATTTCATAA GGAGAAATAC CTGGCTCACG TGTTGGCACT 2340

TGTTCTAATC TCAAATGTAA CCCACTACCA CCTTTTGCCG CCATTTCAGA CGATGAAGAT 2400 

GTTAAACCAG CAGCACCCAT ATCTTGAATA CCAACTAATT CATCAAATGT AATTGCTTCA 2460

AGTGTTGCTT CCATTAATTT TTTACCTACA AATGGATCAC CGATTTGTAC AGAAGGTCGT 2520 
CGACCAGTTT TCAAACCAAC ATAAATGACC GAATTACCTA CACCTTTTGC TGTGCCTTTT 2640

TGAATCATGT CGTGATTGaT AACACCAACA CACATTGCAT TAACAAGTGG ATTGCCATCA 2700

TAACGTTCA7 CAAATTCGAT TTCACCAGCA GTTGTTGGaA TACCAATGCA GTTACCATAA 2760

CCTCCGATAC CCTTTACAAC ACCTTTAAGT AATCTTTGGT TTTGTTTATT ATCTAATTCT 2820

CCAAATCTAA GACTGTTTAA CAAATTAATA GGTCTAGCCC CAATAGAGAC AATGTCACGA 2880

ATGATTCCAC CAACGCCTGT AGCAGCCCCT TGATATGGTT CAATTGCTGA TGGATGATTG 2940

TGAGACTCTA CTTTAAATAC TACGGCTTGA TTATCACCTA TATCGACTAC CCCTGCACCT 3000

TCACCAGGCC CCATAAGCAC ATGGTcACCT GACGTAGGAA ATTGCTTTAA AAACGGTTTA 3060

GAATGTTTAT AAGAGCAATG TTCACTCCAC ATAACAGAAA AGATACCTGT TTCTGTAAAG 3120

TTAGGTTGTC TGCCTAAAAT ATCGCAAACT TTTTCATATT CTTGATCaCT TAATCCCATA 3180

TCTTGATATA CTTTTTCAAG TTTAATTTCT TCAACGCTTG GTTCGATAAA TTTAGACATG 3240

TTGTTCCCTC CAACTTTTTA CCATCGCTTC AAATAATTTC ACACCACTAT CAGTACCTAA 3300

CAACGTTTCT AAAGCTCTTT CagGATGtGG CATCATGCCA CATACATTGC CTTTTTCGTT 3360

AACAATTCCT GCAATATCAT CATATGAACC GTTCGGATTA TTCACATATT TCAGAATAAT 3420

TTGATTGTTA GCTTTTAATT GTTGATATAT TTCATCAGTA CAATAATAAT GACCTTCACC 3480

GTGAGCTACA GGATATATAA CTTTTTCACC TTGTTCATAA AGATTTGTAA ATGCCGTTTG 3540

ATTATTCACT ATTTCTAACT CTTCATTTCT ACTAATAAAT AAATGTGAAT CGTTATGCAA 3600

TAATGCACCA GGTAATAAGC CTATTTCAGT TAAAATTTGA AACCCATTAC AAACACCTAA 3660

TACTGGCTTA CCTTCAGCTG CAAGACGTTT AACTTCCGAA ATAATCGGsG CTACACTAGC 3720

CATTGCCCCA GATCTTAAGT AATCCCCGAA TGAAAATCCA CCAGGAATAA GTACGCCATC 3780

AAATGCACTT AGTGATGTTT CTCTATAATC TACATATTCC GCTTCAACAC CACTTTTAAT 3840

AGCAGCATTA AACATGTCTC TATCACAATT CGAACCTGGA AAAACAAGAA CCGCAAATTT 3900

CATTTTATGC ATTCTCCTTT TCATCATCTA ACACTTTATA GCTATATTCT TCAATCACTG 3960

TATTTGCAAA CAATTTTTCA CTTAGAGTTG TAATAATGTT GTGTACCTTT TCATCACTAA 4020

CCTCATCCAC TGTCATATAT AATACTTTTC CTACACGAAT ATCATTCACT TGTGCATAAC 4080

CTAAGTCATG TACAGCTCGA GTAAGCGTTT GTCCTTGCGT ATCTAATACT TGTGGTTGTA 4140

ATGTGATATG TAGTTCAATT GTTTTCATTA TTTTAAATCC TCCAATTTGT TTAAAAATAT 4200

TTGATATGTT TCAATCAGTG ATCCAGTGTT ATTTCTATAT ACATCTTTAT CAAAGTTTGC 4260

ATTGGTAGCT TTATCCCAAA TTCGACATGT ATCTGGAGAT ATTTCATCCG CTAACAAAAT 4320
ATCCATTAAT TGTTTCAACA CATTATTAAT CTTTAATGCT TTGGATTTTA GTATTTCAAT 4440

ATCTTCATCT GATGCTATAT TGAGCAATTT AACATGGTCA TCCGTTATCA ACGGATCATT 4500

TAACGCATCA TTTTTATAGA AAAATTCTAC AAGTGGTTCT CTAAAAACTT CACCATTTTC 4560

AAAACCTAAA CGCTTTGTAA TAGATCCACT AGCAATATTA CGAACAACTA CTTCTAATGG 4620

AATTATTTTC ACAGGCTTAA CTAATTGTTC TGTTTCAGAT AATTGTTTAA TAAAGTGACT 4680

TTCTATTCCA TTTTCTTGTA AATATTTAAA TATAATAGAA GTAATTTGAT TATTTAATCG 4740

CCCCTTACCT GCCATTGTGT CTTTCTTAGC CCCGTTTCCA GCAGTAACTT CATCTTTATA 4800

TTCAACTCTT AATTCATTTT CTTGATTTGT TGAGAAAATG CGcTTCGCTT TTCCTTCATA 4860

TAATAATGTC ATGCTTTAAT TACTCCCCTC AAATTTAGCG TACATATCTT GTTCAGTTTG 4920

GTTTACATCA TTCGTTAGTA CAGTCATATG CCCCATTTTT CTGCTATCTT TACGCTCAGA 4980

CTTACCATAA ATATGTAAGT GCCACTCTGG ATGTTCATTA AATTCATTTT CCAATAAATC 5040

TAAATCTTTA CCTAGTAAGT TCATCATGAC TGCTGGCTTT AATAATTCAA TTGAATTTGG 5100

TAATGATTGT CCGGTAACTG CTAAAATATG AGTATCAAAT TGTGAATAAT CACATGCTTC 5160

AATTGAATAA TGTCCGGAAT TGTGAGGCCT TGGTGCTATC TCGTTCACAT ACAATTGGTT 5220

GTTACTATCT ATAAAAAATT CAACTGTAAA TGTTCCAATG AAATGAATCG ATTGGATAAT 5280

TTTATTAACT TGCTCTTTCG CCTCAGCTGT TTTATCTATT CTCGCTGGAA CAATTGTTTT 5340

GAAAAGTATT TGATTTCTAT GCTCATTTTC TTGTAATGGG AAAAAAGTGA TTTGATTGTT 5400

GTTTCCTCTT GTAACAGTAA GAGATACTTC TTTCTTGATA TTCAAATATT TTTCAGCTAC 5460

GCATTCACTA GTTTCAATTA ATTTAAAACC TTCTTGTAAG TCTTTTTCGT TGTTAATTAA 5520

AACTTGACCT TTGCCATCGT AGCCACCAAA TCTAGTTTTT ACAATAAAAG GATATCCTAA 5580

TGTl^CAATT GCTTTGTCAA TATCTGTAGA TTCTTTTACT GAAATGAACG GGACAACTTT 5640

GGTACCAGCA CTTTTTAATG TTTCTTTTTC AGTTAAGCGA TCTTGTAATA ACTGTATAGC 5700

TTGGTAACCT TGCGGAATAT TGTACTTTTC ACATAATAGT TTTAATTGTT GGGCTGAAAT 5760

GTTTTCAAAT TCATAAGTAA TCACATCACA TTTTTGTCCT AATTGATTGA GTGCCTTTTC 5820

ATCGTCATAC TTGGCTTGTA TAAATTCGTG TGCAACGTAT CTACATGGAC AATCTTCAGA 5880

AGGATCCAAT ACAACCACTT TATAACCCAT TTTTTGAGCT GATTGTGCCA TCATCTTTCC 5940

AAGCTGACCA CCACCAATAA TGCCAATAGT CGCACCAAAC TTTAATTTAT TGAAGTTCAT 6000

TTTGCATGTC CTCCACTTTT TGAATTAACG AAGATTCATA CTGATTTAGT TTTTCAACTA 6060

AAGAAGGATT TTGAATACTT AACATTCTTG CTGCAAGTAT ACCTGCGTTT TTAGCACCTG 6120

AAGAATCTAT ACCCTTTAAA CTTTTTGTTT CAATCGGCAC TCCAATAACT GGTAGCGTCG 6240

TTAATGATGC AACCATACCT GGTAAATGTG CCGCACCGCC AGCGCCTGCA ATGATAATGT 6300 

TTATACCTCT TTCTCTCGCT TCAGAAGCAA ATTGAACCAT CATTTTTGGC GTACGATGTG 6360 

CGGATACTAC TTGTTTTTCG TACGGAATTT CAAAATAATC CAACATGTTA CAACTCTCTT 6420 

GCATAATTTT CCAATCGGAA GAACTGCCCA TAATGACTGC TACTTTCACT TTGTACACCC 6480 

TTTCAAAAGT TTGAATTGTG AATTACTTTA GTTGTATATT ATAGATATAG CATAACAAGC 6540

AATTTCTGCT TTTTCAATCA AAAATCGAAC TTTATTTTGA TTTTTTATTT GAATTTACGT 6600 

CTTTTGCTAT GTAAATTAGT TTTATAAACT AACAAAGTTA GGATATTGAC AATAGGAGGA 6660 

GAAGTTTTTA TGGTTGCTAA AATTTTAGAT GGTAAACAAA TTGCCAAAGA CTACAGACAG 6720

GGGTTACAAG ATCAAGTTGA AGCGCTAAAA GAAAAGGGTT TTACACCTAA ATTATCCGTT 6780 

ATATTAGTTG GTAATGATGG CGCTAGTCAA AGTTATGTTA GATCAAAAAA GAAAGCAGCT 6840

GAAAAAATTG GTATGATTTc AGAAATCGTA CATTTGGAAG AAACAGCTAC TGAAGAAGAA 6900 

GTATTAAACG AACTAAATAG ACTAAATAAT GATGATTCTG TAAGTGGTAT TTTGGTACAA 6960 

GTACCATTAC CAAAACAAGT TAGCGAACAG AAAATATTAG AAGCAATCAA TCCTGAAAAA 7020

GATGTGGACG GTTTTCATCC AATAAATATA GGGAAATTAT ATATCGATGA ACAAACTTTT 7080 

GTACCTTGCA CACCGCTCGG CATCATGGAA ATATTAAAAC ATGCTGATAT TGATTTAGAA 7140

GGTAAAAATG CAGTTGTAAT TGGACGAAGT CATATTGTCG GACAACCAGT TTCTAAGTTA 7200 

CTACTTCAAA AAAATGCATC AGTAACAATC TTACATTCTC GTTCAAAAGA TATGGCATCA 7260 

TATTTAAAAG ATGCTGATGT CATTGTCAGT GCAGTTGGTA AGCCTGGTTT AGTAACAAAA 7320 

GATGTGGTCA AAGAAGGAGC AGTAATTATC GATGTTGGCA ATACGCCAGA TGAAAATGGC 7380

AAAXTAAAAG GTGACGTTGA TTATGATGCG GTTAAAGAAA TTGCTGGAGC TATTACACCA 7440

GTTCCTGGTG GCGTTGGTCC ATTAACAATT ACTATGGTAT TAAATAATAC TTTGCTTGCA 7500 

GAAAAAATGC GTCGAGGTAT TGATTCGTAA AGAGCCTGAG ACATAAATCA ATGTTCTATG 7560 

CTCTACAAAG TTATAATGGC AGTAGTTGAC TGAACGAAAA TTCGCTTGTA ACAAGCTTTT 7620 

TTCAATTCTA GTCAACCTTG CCGGGGTGGG ACGACGAAAT AAATTTTACG AAAATATCAT 7680

TTCTGTCCCA CTCCCTAATA ACTGAGTTTT AATGAAGTCT TTTAACCCAC ATTAAATATT 7740

ATTTTGCAAT TGCAATGAAT AACAAGAAAA ATCTGGGACA TTAATCGATC AAATGCTCCC 7800 

TTCAAAGTAG ACATTGAATA AATGAAGGCT TTGAAGGGAG CATTTCACTT TGTACTTGGC 7860

TCAACAATTT TATATAGACA GTAGTTAATT GAATGAAAAT AAGCTTGTAA CAAGTTTTCA 7920 

GTTGGGGATG GGCCCCAACA CAGAAGCTGT GACTATGATA AAGTACTACT ACATAGTTAA 8040

TCATTAGTGG TTCTTTATCA TTTTCGCCTC CCTTTTCTTA TTGTTTTGAT ACACAAAAAT 8100 

TTAAGTTCAA ACTGTCGAAT AAAGTTATAT TTGATTTCAA ATTATCCCTA AATTATTAAT 8160 

TkTACAATTG TGGCAGATTT TCAAAATAAT AATTATTTCC TCATTATTTA TAAATTTATA 8220 

TTTAAATTTC ATTCTTTATA GGGTAAGATT AGGACTATAG TATGATGTGT ArATAATATA 8280 

AATTAAGGTA TAGTAAAGCT AACTCAGAAA TGACTTATCA TTCGGAGGTT ACATTATGAA 8340

TAAACTATTA CAGTCATTAT CAGCCCTCGG TGTTTCTGCT ACACTAGTAA CACCAAATTT 8400 

AAATGCAGAT GCAACGACGA ATACTACACC ACAAATTAAA GGCGCTAATG ATATCGTTAT 8460 

TAAGAAAGGT CAAGATTATA ACCTTCTAAA CGGCATAAGT GCATTTGATA AAGAAGATGG 8520 

AGATTTAACC GATAAAATTA AAGTCGATGG CCAAATTGAT ACATCTAAAT CTGGTAAATA 8580 

TCAAATTAAA TATCATGTCA CTGATTCAGA TGGTGCAATT AAAATTTCCA CTAGGTATAT 8640

TGAGGTTAAA TAGCCCTCAT CACTATACTG CAAATAAAAT GGTAGCAAAC GAACATGTTT 8700 

TGCTACCATT TTATTTGTTA TTCTAACTTC ATCTGCAACT TTAACCCAAA TATTGTATTT 8760 

TTTCTGTATA CCAAAGGACT ACCTATCAAA TTATTAAAAC TTAACTGCTC TTTTTAAAAA 8820

AATGTTTTGA TTTTGAACAA ACAAATTTCC ACTTTTCATT GTTTAACGAT AAATTACTTT 8880 

TGGCAAATTC CTTATTAAAA TGTTTGCGCT TCCTTTCAAT CAACTAGCCA TCATTTTCAA 8940

TTTATTAGAC AATTTCAAAC TTTTTTTATT TTCATTCAAT TAACCTTTAA TTGAAAGCTA 9000 

TTCTCAACTT TCCTTTTAAA TATGAAGCAA TTTTTTCAAA AACGCTATTA GTCACAAAAT 9060 

GT 9062 
(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

AAATATTTTT TCAAAACTAT GTGAAAATGG aCCATGTCtA aATCATGTAA TAATGCAGyA 60

CATAATGCCA ACGGTCTmTC TTTATTGTCC CATGCATCAT GACCAATAAA TGACTCATCA 120

ATTAATCGTC TAACTATTTC ATACACACCT AAAGAATGTC CAAAGCGACT ATGTTCTGCT 180

GTGTGAAAAG ATAGGTACAG TGTTCCTAGT TGTCTAATTC GACGTAACCT TTGGAATTCC 240
TCTTTAAAAA CTTTTTCTTC TACTAATTTT AAATCTACAT ATGCGTTAGT CATTATTCCC 360

CTCCTTTTCG TTTAATATAA TATTTAATTT ACTTAAAATG CTTTGTACAT AAGTGCTAAG 420

TCTAACTTTT CGCCATACAT TTCTGGCTCA TAAGAGCGTA AGATTGTAAA ACCTTGCTCT 480

TTATAGTAAG CTACTGCTTC TTCATTTTTA TTATCTACTT CTAAGTAAAC ACCTTCAAAT 540 

TTATCTTCAA AACGTGATAA TCCTTCATTT AACAATGCTG TACCATAACC TGTATGTTGC 600 

GATTCTGGTT TAACATAATG AGCTGATAAA TATAATTCTT CACCGTAAAT AAAGTTAGCA 660 

AAGCCAACGA TGTCATTACC TTCTTCAACG ACTAAGAATA ATTGTTCTTG AAGTCTTTTC 720 

TTTAAATGAT GTTCATTATA TGAAGCTtCT AACAAGTGAT TAACTGTTGT CGCAGCGTAT 780

ATATTTAAGT ATGTATTAAA CCAAGCTTTA GTTGCGACAT CTCTAATTTG AACAACATCT 840

TTTTCAGTTG CTTGTCTTAC CTTGAACATG ACTTTCTCCC CTTATTAACA AGTTTTAATA 900 

ACGGCATTAT ACCACAACTT GCTCAATACT TAATAAACAA TGATTGTCTA TTCAATTTAT 960 

ATATtTATAT TTTCCGTTAA AATTAAAAAT AAAAAATAAC GAAGCAAAAA AtCACTTCGT 1020 

TTAGTATGAG GTATGTCTTA TTGCAATATA CTATTCCACT CAGTTGCACG TGCTAAGGCA 1080 

TAGTTGTCTT TCATGATGTC ACCAGGCTTT TCAGCAGTTC CAATAATATA ACCATTTAAA 1140

GTGGCACCTA rAAAGTCTAA ACTATATTTC ATTTGCGTAA TTGCTGGTTC GCTTTTATTT 1200 

TTGGACAATC TCCACCAACT AAAATAACTC TAAAATCCTT TTCGGCCATT TGTGCCTTAA 1260 

AATTAGGATA TCGTTTATCT TGTAATGTTT CTGACCAATG TTCGATAAAT GCTTTCAATG 1320

GTGCTGAAAT GCTATACCAA TACACTGGTG ATGCAAAAAT AATTGTATCA CTAGCCAATA 1380 

TTTTATCTAG AATCGGCAAA TAGTCATCGT CATATGAAGT AATAGTCTCT GCTGTATGTC 1440

TCACGTCACG TATCGGTTTA AACTGATGTT GTGTCACGTC AATCCATTGA TACTCTAAAT 1500 

CTTGCAAAGC GAATTTTGTT AATTGTGCAG TATTACCGTT TGGTCTACTC CCACCAAACA 1560 

AAACAGTAAT CATTTTAGCC TAACCTCACT TTTGATTAAT AAATATCTGT GTTTTTCGTT 1620 

ACCTAATTAT ACTATCATAA GCTTTGCCTA CCGAATAGTA AAACGCTTAC AACTTTTATA 1680

TAAATTTGAC GAAATTTCGT CATGCCTTAT ATAACGTCGT TTGTGATACG GGGCTAATTC 1740

ATGATGAAAT TAGATACATA TATCACCATT AAATACAATT CATTTAGTCT TCAATCGGAA 1800 

ACAGTTCATC GATATATTGA ATCTCATCAT CTGATAAAAC GATATCTGCA GCTTTAATAT 1860

TTTCAACGAC TTGTTCTGCA CGTTTTGCAC CAGGAATAAT CACATCGATA GCTGGTCTCG 1920 

TTAAATAAAA TGCTAATACA ATGTTCGCAA TTGAAGTTTG ATGTGCTGCA GCTATGCTTT 1980

CCAAAGCTTT TACGCGACGC ACATTTTCTT CAAATACACC TGGTTTAAAA TCACGACGTG 2040 
GCTAATGGGA AATATGGAAT AAATGTGATT TGGTGATCAA CACAATATTG TAATACTGCC 2160

TCATTTTCGC GATGCAATAA ATTATATTCT AACTGTACAA CATCAACGTA ACCATCTTTA 2220 

TTTGCTTCTT TAAGTTGATC TAATGTGAAA TTTGATACAC CAATTGCTTT AATCTTCCCT 2280

TGTTCCTTAA GCTCTTGTAA TGCTGCAACT GCTTGATCTT TCGGAGTGTT GTTATCCGGA 2340

AAATGAATAT AATATAAATC GATATAATCA GTTTGTAGAC GTTTCAAACT ATTCTCAACT 2400 

TGTTGTTTTA AATATTCCGG TTGATTGTTC TGATGTACTT CTTGATTTTC ATCAAATTCA 2460 

TGAGACCCTT TCGTAGCAAT TTTAATTTGC TCTCGCGGAT ATTCTTTAAC AACTTCTCCA 2520 

ACCAATTCTT CTGATCGTTC TGGCCCATAA ATATATGCCG TATCTAATAA ATTAATACCA 2580 

TGATTAATGG CTTGACGAAC AACATCTTTT CCTTGTTCTT CATCTAAGTT CGGATATAAA 2640

TTATGCCCAa CCTAtGCGTT CGTCCCAAGT GCGATTGGAA ACACTTCAAC ATCAGATTTA 2700 

CCTAAGTTTA CAAATTGCTn CATTAGACCC AGCnCCTT 2738

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9425 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
GATTAGATGA TATTTAACGA AAATTAaGrT GmAATACTtG AATGTArGAa GTCTGATGTC 60

GAAAATAGCT ATTAAAATAG AGTAGACGTA ATGtAAATGA AAGCACCTAA AATAGAAAAA 120

TTTCAAAAAT AGCGTAATTA TTATAATAAA TAGACTGCCA ATAAAATGCA ATTTTTCACT 180

TATAACATTC TTGAAAAAAT AATAGCAAAA TTATGTAAAA AATATCTTGT CATGGCAAGA 240

TTGGCTGTGC TATAATCTAT CTTGTGCTTA AGAACGGCTC CTTGGTCAAG CGGTTAAGAC 300

ACCGCCCTTT CACGGCGGTA ACACGGGTTC GAGTCCCGTA GGAGTCACCA TTTTTTAGGT 360

CTCGTAGTGT AGCGGTTAAC ACGCCTGCCT GTCACGCAGG AGATCGCGGG TTCGATTCCC 420

GTCGAGACCG TACAAATGCC TATCCAAGAG GATAGGCATT TTTTTGCGTT TAATATTATA 480

TTAATAAAAG ATATATGGAC GAATGATAAT CATATTGATT TATCTGTTCG TCCATTTTCT 540

TTAAAATGTA TGAACCTCAA GTAACTTAGT GGTTGGATAT GAAAGATAAA CGTAGACAAT 600

AAAATCTTTA TTAGACGTAC AAACATATGC TACTGTCAAC ATATTTCTTC GTTGTGATAT 660

GCCACCAGTC CTCCATAACA TCAATTGTTA AAGTAACGAA TAACGAATAA TGATATTTAT 720
GACCTCATCA TTGTGTTAAA TATCATTGTC ACAATCCGCC GTGAGAAACT AATAAAAAAT 840

AGTAATATAT AAGTTTATAT TGGAAAATAG AATTAATAGC TTATAAATGG TAAATTATAT 900 

AATAGGTTAC TATACGTTAT AAGACGGAAA ATGCGCACAA TAACAAAAAT AGTAAGCGAC 960

ATCCTGTGAT TTTTTACACA AACATAAACG ATAAAGAACA AAAAATGATA AAATAATATT 1020 

AATGATTTAA GAAAAGAGGT TTATGCAAAT GGCTAGAAAA GTTGTTGTAG TTGATGATGA 1080 

AAAACCGATT GCTGATATTT TAGAATTTAA CTTAAAAAAA GAAGGATACG ATGTGTACTG 1140

TGCATACGAT GGTAATGATG CAGTCGACTT AATTTATGAA GAAGAACCAG ACATCGTATT 1200 

ACTAGATATC ATGTTACCTG GTCGTGATGG TATGGAAGTA TGTCGTGAAG TGCGCAAAAA 1260 

ATACGAAATG CCAATAATAA TGCTTACTGC TAAAGATTCA GAAATTGATA AAGTGCTTGG 1320 

TTTAGAACTA GGTGCAGATG ACTATGTAAC GAAACCGTTT AGTACGCGTG AATTAATCGC 1380 

ACGTGTGAAA GCGAACTTAC GTCGTCATTA CTCACAACGA GCACAAGACA CTGGAAATGT 1440 

AACGAATGAA ATCACAATTA AAGATATTGT GATTTATCCA GACGCATATT CTATTAAAAA 1500 

ACGTGGCGAA GATATTGAAT TAACACATCG TGAATTTGAA TTGTTCCATT ATTTATCAAA 1560 

ACATATGGGA CAAGTAATGA CACGTGAACA TTTATTACAA ACAGTATGGG GCTATGATTA 1620

CTTTGGCGAT GTACGTACGG TCGATGTAAC GATTCGTCGT TTACGTGAAA AGATTGAAGA 1680 

TGATCCGTCA CATCCTGAAT ATATTGTGAC GCGTAGAGGC GTTGGATATT TCCTCCAACA 1740 

ACATGAGTAG AGGTCGAAAC GAATGAAGTG GCTAAAACAA CTACAATCCC TTCATACTAA 1800

ATTTGTAATT GTTTATGTAT TACTGATTAT CATTGGTATG CAAATTATCG GGTTATATTT 1860

TACAAATAAC CTTGAAAAAG AGCTGCTTGA TAATTTTAAG AAGAATATTA CGCAGTACGC 1920 

GAAACAATTA GAAATTAGTA TTGAAAAAGT ATATGACGAA AAGGGCTCCG TAAATGCACA 1980

AAAAGATATT CAAAATTTAT TAAGTGAGTA TGCCAACCGT CAAGAAATTG GAGAAATTCG 2040 

TTTTATAGAT AAAGACCAAA TTATTATTGC GACGACGAAG CAGTCTAACC GTAGTCTAAT 2100 

CAATCAAAAA GCGAATGATA GTTCTGTCCA AAAAGCACTA TCACTAGGAC AATCAAACGA 2160 

TCATTTAATT TTAAAAGATT ATGGCGGTGG TAAGGACCGT GTCTGGGTAT ATAATATCCC 2220 

AGTTAAAGTC GATAAAAAGG TAATTGGTAA TATTTATATC GAATCAAAAA TTAATGACGT 2280 

TTATAACCAA TTAAATAATA TAAATCAAAT ATTCATTGTT GGTACAGCTA TTTCATTATT 2340

AATgCACAGT CATCCTAGGA TTCTTTATAG CGCGAACGAT TACCAAACCA ATCACCGATA 2400

TGCGTAACCA GACGGTCGAA ATGTCCaGAG GTAACTATAC GCAACGTGTG AAGATTTATG 2460

GTAATGATGA AATTGGCGAA TTAGCTTTAG CATTTAATAA CTTGTCTAAA CGTGTACAAG 2520 
GTGATGGTAT TATTGCAACA GACCGCCGTG GACGTATTCG TATCGTCAAT GATATGGCAC 2640

TCAAGATGCT TGGTATGGCG AAAGAAGACA TCATCGGATA TTACATGTTA AGTGTATTAA 2700

GTCTTGAAGA TGAATTTAAA CTGGAAGAAA TTCAAGAGAA TAATGATAGT TTCTTATTAG 2760

ATTTAAATGA AGAAGAAGGT CTAATCGCAC GTGTTAACTT TAGTACGATT GTGCAGGAAA 2820

CAGGATTTGT AACTGGTTAT ATCGCTGTGT TACATGACGT AACTGAACAA CAACAAGTTG 2880

AACGTGAGCG TCGTGAATTT GTTGCCAATG TATCACATGA GTTACGTACA CCTTTAACTT 2940

CTATGAATAG TTACATTGAA GCACTTGAAG AAGGTGCATG GAAAGATGAG GAACTTGCGC 3000

CACAATTTTT ATCTGTTACC CGTGAAGAAA CAGAACGAAT GATTCGACTG GTCAATGACT 3060

TGCTACAGTT ATCTAAAATG GATAATGAGT CTGATCAAAT CAACAAAGAA ATTATCGACT 3120

TTAACATGTT CATTAATAAA ATTATTAATC GACATGAAAT GTCTGCGAAA GATACAACAT 3180

TTATTCGAGA TATTCCGAAA AAGACGATTT TCACAGAATT TGATCCTGAT AAAATGACGC 3240

AAGTATTTGA TAATGTCATT ACAAATGCGA TGAAATATTC TAGAGGCGAT AAACGTGTCG 3300

AGTTCCACGT GAAACAAAAT CCACTTTATA ATCGAATGAC GATTCGTATT AAAGATAATG 3360

GCATTGGTAT TCCTATCAAT AAAGTCGATA AGATATTCGA CCGATTCTAT CGTGTAGATA 3420

AGGCACGTAC GCGTAAAATG GGTGGTACTG GATTAGGACT AGCCATTTCG AAAGAGATTG 3480

TGGAAGCGCA CAATGGTCGT ATTTGGGCAA ACAGTGTAGA AGGTCAAGGT ACATCTATCT 3540

TTATCACACT TCCATGTGAA GTCATTGAAG ACGGTGATTG GGATGAATAA TAAGGAGCAT 3600

ATTAAATCTG TCATTTTAGC ACTACTCGTC TTGATGAGTG TCGTATTGAC ATATATGGTA 3660

TGGAACTTTT CTCCTGATAT TGCAAATGTC GACAATACAG ATAGTAAGAA GAGTGAAACG 3720

rAACCTTTAA CGACACCTAT GACAGCCAAA ATGGATACAA CTATTACGCC ATTTCAGATT 3780

ATTCATTCGA AAAATGATCA TCCAGAAGGA ACGATTGCGA CGGTATCTAA TGTGAATAAA 3840

CTGACGAAAC CTTTGAAAAA TAAAGAAGTG AAGTCCGTGG AACATGTTCG TCGTGATCAT 3900

AACTTGATGA TTCCTGATTT GAACAGTGAT TTTATATTAT TCGATTTTAC GTATGATTTA 3960

CCGTTATCAA CATATCTTGG TCAAGTACTG AACATGAATG CGAAAGTACC AAATCATTTC 4020

AATTTCAATC GTTTGGTCAT AGATCATGAT GCTGATGATA ATATCGTGCT TTATGCTATA 4080

AGCAAAGATC GCCACGATTA CGTAAAATTA ACAACTACAA CGAAAAATGA TCATTTTTTA 4140

GATGCATTAG CAGCAGTGAA AAAAGATATG CAACCATACA CAGATATCAT CACAAACAAA 4200

GATACAATTG ATCGTACGAC GCATGTTTTT GCACCAAGTA AACCTGAAAA GTTAAAAACA 4260

TATCGCATGG TATTTAACAC GATTAGTGTT GAGAAAATGA ATGCTATACT ATTTGACGAT 4320

GCAAACTATA ACGATAAAAA TGAAAAATAT CATTATAAAA ACCTGTCCGA AGATGAAGCG 4440

AGTTCCAGCA AAATGGAAGA AACGATTCCA GGAACCTTTG ATTTTATTAA TGGTCATGGT 4500 

GGTTTCTTAA ACGAAGACTT TAGATTGTTT AGTACGAATA ATCAGTCAGG CGAGTTAACA 4560

TATCaACGTT TCCtTAATGG TTATCCAACG TTTAATAAAG AAGGTTCTAA TCAAATTCAA 4620 

GTCACTTGGG GTGAAAAAGG CGTCTTTGAC TATCGTCGTT CGTTATTACG CACCGACGTT 4680 

GTTTTAAATA GTGAGGATAA TAAATCGTTG CCGAAATTAG AGTCTGTACG TTCAAGCTTA 4740 

GCGAACAATA GTGATATTAA TTTTGAAAAA GTAACAAACA TCGCTATCGG TTACGAAATG 4800 

CAGGATAATT CAGATCATAA TCACATTGAA GTGCAGATTA ACAGTGAACT CGTACCGCGT 4860 

TGGTATGTAG AATATGATGG CGAATGGTAT GTTTATAACG ATGGGaGGCT TGaATAAATG 4920 

AACTGGaAAC TGACAAAGAC ACTTTTCATT TTCGTGTTTA TTCTTGTCAA CATCGTGTTA 4980

GTATCGATTT ATGTTAATAA AGTCAATCGC TCACACATTA ATGAAGTCGA GAGTAACAAT 5040

GAAGTTAATT TTCAGCAAGA AGAAATTAAA GTACCGACTA GTATATTGAA TAAATCAGTT 5100 

AAAGGTATAA AATTAGAGCA AATTACAGGG CGATCAAAAG ACTTTAGTTC TAAAGCTAAA 5160 

GGCGATTCGG ATTTGACCAC ATCAGATGGT GGAAAATTAT TGAATGCGAA CATTAGTCAA 5220

TCGGTAAAGG TCAGTGACAA TAACTTAAAA GATTTGAAAG ATTATGTTAA CAAGCGCGTA 5280 

TTTAAAGGTG CTGAATATCA ATTAAGCGAG ATTAGTTCAG ATTCTGTAAA ATATGAACAA 5340

ACGTATGATG ATTTTCCGAT TTTAAATAAC AGTAAAGCGA TGTTAAACTT TAATATAGAA 5400

GATAACAAAG CGACTAGTTA TAAACAATCA ATGATGGATG ACATTAAGCC CACAGATGGT 5460 

GCAGATAAGA AGCATCAAGT GATTGGTGTG AGAAAAGCAA TCGAGGCATT ATATTATAAT 5520 

CGTTACTTGA AAAAAGGTGA TGAAGTCATT AATGCTAGAC TCGGTTACTA CTCAGTCGTG 5580

AATG&AACGA ATGTTCAATT GTTACAACCA AACTGGGAAA TTAAAGTGAA GCATGACGGT 5640

AAGGATAAAA CGAATACTTA CTATGTCGAA GCGACAAATA ATAACCCTAA AATTATTAAT 5700 

CATTAATATG AATCGTAATA AGCTAGCATT GCAAGCTCAT CATATGTGAG AAGCGGTGCT 5760

AGCTTTTTTG CTGGTACGGT TTATTATGGC TGATGTTTTT GCGTCTCCAA CGTGCGCATT 5820

TATTCATATT TTAAGTAGAA CCGCATTGTA AAATTAGTGT AACTGTTATT TTAAAAACTT 5880 

TAGTATTTGT CTAATCATTG TTATAATAAT TAAGAAATTC ATTGCACGTG ATTATCAAAA 5940

TTTAAATATA AGAAACCGGT CGATGAACTA AAGTTACATA ATAGGAAAGG TATACAAAAC 6000

AGCTAATATA CTGATAGTTT CTGTAGGGAA AATCGTATAT TTGCACTGAT GTATATTGCA 6060

GTCATATAGA GAGATTGACT GTTTAAAGAG AAAGGATGAG CCGCTTGATA CGCATGAGTG 6120 
TAGTTGATGT TGGTTTGACT GGAAAGAAAA TGGAAGAATT GTTTAGTCAA ATTGACCGTA 6240

ATATTCAAGA TTTAAATGGT ATTTTAGTAA CCCATGAACA TATTGATCAT ATTAAAGGAT 6300 

TAGGTGTTTT GGCGCGTAAA TATCAATTGC CAATTTATGC GAATGAAAAA ACTTGGCAGG 6360 

CAATTGAAAA GAAAGATAGT CGCATCCCTA TGGATCAGAA ATTCATTTTT AATCCTTATG 6420 

AAACAAAATC TATTGCAGGT TTCGATGTTG AATCGTTTAA CGTGTCACAT GATGCAATAG 6480 

ATCCGCAATT TTATATTTTC CATAATAACT ATAAGAAGTT TACGATTTTA ACGGATACGG 6540 

GTTACGTGTC TGATCGTATG AAAGGTATGA TACGTGGCAG CGATGCGTTT ATTTTTGAGA 6600 

GTAATCATGA CGTCGATATG TTGAGAATGT GTCGTTATCC ATGGAAGACG AAACAACGTA 6660 

TTTTAGGCGA TATGGGTCAT GTATCTAATG AGGATGCGGC TCATGCAATG ACAGACGTGA 6720 

TTACAGGTAA CACGAAACGT ATTTACCTAT CGCATTTATC ACAAGACAAT AACATGAAAG 6780 

ATTTGGCGCG TATGAGTGTT GGCCAAGTAT TGAACGAACA CGATATTGAT ACGGAAAAAG 6840

AAGTATTGCT ATGTGATACG GATAAAGCTA TTCCAACGCC AATATATACA ATATAAATGA 6900 

GAGTCATCCG ATAAAGTTCC GCATTGCTGT GAGACGACTT TATCGGGTGC TTTTTTATGT 6960

TGTTGGTGGG AAATGGCTGT TGTTGAGTTG AATCGGCTTG ATTGAAATGT GTAAAATAAT 7020

TCGATATTAA ATGTAATTTA TAAATAATTT ACATAAAATC AATCATTTTA ATATAAGGAT 7080 

TATGATAATA TATTGGTGTA TGACAGTTAA TGGAGGGAAC GAAATGAAAG CTTTATTACT 7140 

TAAAACAAGT GTATGGCTCG TTTTGCTTTT TAGTGTAATG GGATTATGGC AAGTCTCGAA 7200

CGCGGCTGAG CAGCATACAC CAATGAAAGC ACATGCAGTA ACAACGATAG ACAAAGCAAC 7260 

AACAGATAAG CAACAAGTAC CGCCAACAAA GGAAGCGGCT CATCATTCTG GCAAAGAAGC 7320 

GGCAACCAAC GTATCAGCAT CAGCGCAGGG AACAGCTGAT GATACAAACA GCAAAGTAAC 7380 

ATCGAACGCA CCATCTAACA AACCATCTAC AGTAGTTTCA ACAAAAGTAA ACGAAACACG 7440 

CGACGTAGAT ACACAACAAG CCTCAACACA AAAACCAACT CACACAGCAA CGTTCAAATT 7500 

ATCAAATGCT AAAACAGCAT CACTTTCACC ACGAATGTTT GCTGCTAATG CACCACAAAC 7560 

AACAACACAT AAAATATTAC ATACAAATGA TATCCATGGC CGACTAGCCG AAGAAAAAGG 7620 

GCGTGTCATC GGTATGGCTA AATTAAAAAC AGTAAAAGAA CAAGAAAAGC CTGATTTAAT 7680 

GTTAGACGCA GGAGACGCCT TCCAAGGTTT ACCACTTTCA AACCAGTCTA AAGGTGAAGA 7740

AATGGCTAAA GCAATGAATG CAGTAGGTTA TGATGCTATG GCAGTCGGTA ACCATGAATT 7800 

TGACTTTGGA TACGATCAGT TGAAAAAGTT AGAGGGTATG TTAGACTTCC CGATGCTAAG 7860

TAcTAACGTT TATAAAGATG GAAAACGCGC GTTTAAGCCT TCAACGATTG TAACAAAAAA 7920 
TGAAGGCATT AAAGGCGTTG AATTTAGAGA TCCATTACAA AGTGTGACAG CGGAAATGAT 8040

GCGTATTTAT AAAGACGTAG ATACATTTGT TGTTATATCA CATTTAGGAA TTGATCCTTC 8100 

AACACAAGAA ACATGGCGTG GTGATTACTT AGTGAAACAA TTAAGTCAAA ATCCACAATT 8160 

GAAGAAACGT ATTACAGTTA TTGATGGTCA TTCACATACA GTACTTCAAA ATGGTCAAAT 8220 

TTATAACAAT GATGCATTGG CACAAACAGG TACAGCACTT GCGAATATCG GTAAGATTAC 8280 

ATTTAATTAT CGCAATGGAG AGGTATCGAA TATTAAACCG TCATTGATTA ATGTTAAAGA 8340 

CGTTGAAAAT GTAACACCGA ACAAAGCATT AGCTGAACAA ATTAATCAAG CTGATCAAAC 8400 

ATTTAGAGCA CAAACTGCAG AGGTAATTAT TCCAAACAAT ACCATTGATT TCAAAGGAGA 8460 

AAGAGATGAC GTTAGAACGC GTGAAACAAA TTTAGGAAAC GCGATTGCAG ATGCTATGGA 8520 

AGCGTATGGC GTTAAGAATT TCTCTAAAAA GACTGACTTT GCCGTGACAA ATGGTGGAGG 8580 

TATTCGTGCC TCTATCGCAA AAGGTAAGGT GACACGCTAT GATTTAATCT CAGTATTACC 8640

ATTTGGAAAT ACGATTGCGC AAATTGATGT AAAAGGTTCA GACGTCTGGA CGGCTTTCGA 8700 

ACATAGTTTA GGCGCACCAA CAACACAAAA GGACGGTAAG ACAGTGTTAA CAGCGAATGG 8760 

CGGTTTACTA CATATCTCTG ATTCAATGCG TGTTTACTAT GATATAAATA AACCGTCTGG 8820

CAAACGAATT AATGCTATTC AAATTTTAAA TAAAGAGACA GGTAAGTTTG AAAATATTGA 8880 

TTTAAAACGT GTATATCACG TAACGATGAA TGACTTCACA GCATCAGGTG GCGACGGATA 8940

TAGTATGTTC GGTGGTCCTA GAGAAGAAGG TATTTCATTA GATCAAGTAC TAGCAAGTTA 9000 

TTTAAAAACA GCTAACTTAG CTAAGTATGA TACGACAGAA CCACAACGTA TGTTATTAGG 9060 

TAAACCAGCA GTAAGTGAAC AACCAGCTAA AGGACAACAA GGTAGCAAAG GTAGTAAGTC 9120 

TGGTAAAGAT ACACAACCAA TTGGTGACGA CAAAGTGATG GATCCAGCGA AAAAACCAGC 9180

TCCftGGTAAA GTTGTATTGT TgtAGCGCAT AGAGGAACTG TTAGTAGCGG TACAGAAGGT 9240

TCTGGTCGCA CAATAGAAGG AGCTACTGTA TCAAGCAAGA GTGGGAAACA ATTGGCTAGA 9300

ATGTCAGTGC CTAAAGGTAG CGCGCATGAG AAACAGTTAT TTCATAATCA ACAGTCATTG 9360 

ACGTAGCTAA GTAATGATAA ATAATCATAA ATAAAATTAC AGATATTGAC AAAAAATAGT 9420 

AAATA 9425 
(2) INFORMATION FOR SEQ ID NO: 88: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3886 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

AGTTGTAATG TCACATTTCC AGAGTCTGAA ATTATCTTTA TCACGTTACA TTTACTAGGC 60 

TCTAAAATGA CTGAACATAC AGCATCTTCA ATTACCTTTG AATACCATGA TTTATCGCAA 120 

AATATACATG AATTGATCAC TTGTGTTAGC CAAGAATTAG GCATTGATAT GTCAAAAGAC 180 

AACAAGTTAC ATACCAGTCT GATCACACAT ATCAAACCAG CTATACATCG TATTAAATAC 240 

GATATGCTAC AACCTAATCC TTTGAGGCAA GAAGTTATGC GTCGCTATCC TCAAATCATT 300 

GAAGCCGTTA GCAAGCATAT TAGTCCAATT GAACAAGATG CTGCTATTCG CTTCAACGAA 360 

GATGAATTAA CATACATTAC AATTCACTTC GCATCAAGTA TAGAGCGTGT TGCAACACAT 420

AAACAATCAA TGATTAAGGT TGTCTTACTA TGTGGTTCTG GTATAGGCAC GTCACAACTT 480 

TTAAAATCAA AACTAAATCA CCTGTATCCT GaGTTnCACA TTTGGGAtGc CTATTcCATT 540 

TaTcAATTGG aAGaAAGTCG ATTATTGCAA GATAACATTG ATTATGTCAT TTCAACAGTA 600 

CCTTGTGAAA TATCAGCTGT ACCAGTTATT CATGTCGATC CATTTATCAA TCAACAATCT 660 

CGTCAAAAAT TGAATCAAAT TATCAATGAC TCAAGAGAAC AACGAGTCAT GAAAATGGCA 720 

ACTGATGGCA AGTCACTCGC AGATTTATTG CCTGAACATC GCATCATTAT AAATAAACAA 780

CCATTATCAA TTGAATCCGC AATTGCAGTG GCTGTGCAAC CTTTAATCAA TGATGGCATT 840

GTCTATTCAA ATTATACAGC TGCAATTTTA AAACAATTTG AACAATTCGG GTCATATATG 900

GTCATTAGTC CACATATTGC ACTTATTCAC GCTGGTACTG ATTATGTACA GAATGGTGTA 960

GGTTTCGCAC TAACATATTT CACTGAAGGG ATTATCTTTG GTAGTAAAGC TAACGATCCC 1020 

GTTCACCTTG TAATTACATT AGCAACGGAC CACCCCAATG CACATTTAAA GGCATTGGGA 1080 

CAGTTAAGCG AATGCTTAAG CAACGACTTA TATCGACAAG ATTTCTTAGA TGGGAATATT 1140 

TTT^AATTA AACAACACAT TGCTTTAACT ATGACAAAGG AGGCTTAATA ACGTGTCATT 1200 
AGACATTTTG TCAACAACAC GCATCATTGT AAAAGAACAA GTAAATGATT GGACTGAAGC 1260

TATAACTATA GCTTCTCAGC CATTACTACA AGAACAAATT ATTGAACAAG GCTATGTTCA 1320 

AGCAATGATT GATAGCGTTA ATGAACTTGG ACCTTATATC GTTATCGCAC CTGAAATTGC 1380 

AATTGCACAT GCAAGACCGA ACAATGACGT ACATCAAGTT GGTTTAAGTC TATTAAAGTT 1440 

GAATCAACAT GTGGCATTTT GTGATGAAGA TCACTACGCA TCTCTCATTT TTGTATTGAG 1500 

TGCCATCGAC AATCATTCAC ACTTATCTGT ATTACAAAAT TTAGCAACCG TACTGGGCGA 1560 

TAACCAAACA GTCCAGCAAC TATTAACTGC AACAAATGCA CAAGACATTA AAAACATTTT 1620

AAAGGAGCAT GATTAATATG AAAATTTTAG TAGTATGTGG CCACGGTTTA GGAAGTAGTT 1680 
AAGTTGAACA TAGTGACATT ATGACAGCAA GTCCAGAGAT GGCTGACTTG TTTATTTGTG 1800

GTAGAGATTT AGCTGAAAAT GCCGAACGTC TAGGGGATGT CTTAGTTCTT GATAATATTT 1860 

TAGATAAAGC TGAATTACAA CAAAAGCTCT CAGAAAAATT ACAACAACTT AACATGATTT 1920 

AAAGGAGGTA CGACCTATGC AAGCAATCCT TAATTTTATA GTCGATATTT TAAGTCAACC 1980 

AGCCATTCTT GTTGCACTGA TTGCCTTTAT AGGTTTAATC GTTCAGAAAA AACCTGCCGC 2040 

AACGATCACT TCAGGAACCA TTAAAACGAT ATTAGGCTTC TTAATTTTAA GTGCAGGTGC 2100 

TGATGTCGTC GTTCGATCTC TTGAACCATT CGGCAAAATA TTCCAACACG CATTTGGTGT 2160 

GCAAGGTATC GTACCTAACA ACGAAGCTAT CGTCTCACTA GCCTTAAAAG ATTTTGGAAC 2220

AACAGCTGCA CTCATCATGG TCTGTGGCAT GATTGTTAAT ATTTTAATTG CCCGCTTCAC 2280 

TAATTTAAAA TATATCTTTT TAACAGGTCA TCATACATTT TACATGGCTG CGTTTTTAGC 2340

AATCATTTTA ACAGTCAGTC ATATTAAAGG CTGGCTAACG ATTGTTATCG GCGCACTCGT 2400

ATTAGGATTA ATCATGGCAG TATTACCTGC ATTACTCCAA CCTACGATGC GAAAAATTAC 2460 

AGGGAATGAC CAAGTAGCTT TAGGTCATTT TGGCTCAATC AGTTACTTTG CCGCAGTGCT 2520 

GTAGGTCAAT TATTCAAAGG TAAGTCTAAA TCAACGGAAG AGATTAAATT TCCAAAAGGC 2580

TTAAGTTTCT TACGAGAAAG TACAATTAGT ATCTCGATTA CGATGGCATT ACTTTACTTC 2640 

ATCGCATGCT TATTTGCGGG CGTTAGTTAT GTACACGAAT CTATTAGTGA TGGTCAAAAC 2700 

TTTATTGTCT TTTCATTAAT TCAAGGTGTG ACATTTGCTG CTGGTGTATT TATTATTTTA 2760

ACGGGCGTTC GTTTAATCTT AGCTGAAATC GTCCCAGCAT TTAAAGGAAT TTCTGAAAAG 2820 

CTTGTACCAA ATTCTAAACC TGCATTAGAC TGCCCTATTG TGTTCCCTTA TGCACAAAAT 2880

GCAGTATTAA TTGGATTCTT TGTCAGCTTT ATTACAGGTG TCATCGGTAT GTTTATCTTA 2940

TTCTTATTTG GTGGCGTCGT CATTTTACCT GGCGTAGTTG CACACTTCTT CTTAGGTGCA 3000 

ACGGCTGCTG TATTCGGTAA TGCAAGAGGC GGTATTAAAG GTGCTATTGc TGGCGCCGCT 3060 

CTAAATGGTA TCCTAATCAC GTTTTTACCA TTATTATTCT TGCCATTTTT AGGCGAATTA 3120 

GGTGGTGCTG CAACAACATT CTCAGATACA GACTTTTTAG CTGTCGGTAT CGTGTTCGGT 3180 

AACGCAGTAA AATATATGGG ATTATTTGGT GCGATTCTAT TTATTATTAT CGTAGGTGCG 3240

ACAACAATTT TATTAAAAGG CCGTCAAAAA GAACAGCAAT AGTGTTAACG TAGAAATATA 3300 

AAACACCGTC ACATATTGAG TGAATGCCCC TTTtATCAAG AGGAAAGCCA CTTACTTATG 3360

GACGGTGTTT TGTATTATAT TAAATGATAC TTAGCCATAC TATCGACAGC TGCTAAAATT 3420

GCTTCTTCTT GTGTCGCAAT CGGTTCCCAA CCAAGTAATG TTTTTgCACG TTCGTTACTT 3480

CCTAGACTCA AAATAAAGTC TGGTAATTTT TTAGTAGAAA CTTTTTGAGC TATTTCAGGT 3600

CTCTTTTCTT TAATTAATTT TGCAATTTCC AACAAATTAA TTTGTCCATC AGCCGTCGCA 3660 

ATAAATCGCT TGCCATTAGC TTGTTCATTT GTCATTGCCA AAATGTGCAG TTCAGCTACG 3720

TCTCTCACAT CAACAACATT TAACGGAATT TGCGGTACAC GTTTCATTGA ACCATTCAAT 3780 

AAATTTTCTA ATAAATGAAA GCTTCCTGAA ACGTGTGCAT CTAATGATGG CCCAAAAATT 3840 

GCAACTGGAT TGATTGTGGC AAATTCTACT GTTGTATTTT CATTCT 3886 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4879 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

GTCATCTATC AAAAATTTGG TATACAGACC GACAATTATT AATTAATAAT TTAATTTCCC 60 

AGGCAATACC AGTGATTAAA TATCCACAAA TACAACATAA AGAACAACCA TTAGAATCTA 120

TTTCACAACT TATATTGTCT AAGATGACAT CTAATCAATA GTGTTTAAAT TTCTCAGTGG 180 

CTGTGAATGA GGTTTAAAAG TACTATAAAA CGTAAACTTT GATACTTTAA AATACGCAAA 240 

AAACGGTAAA CCCTAATTCA TATTATAGAG TTTACCGTTT TATTTTTTAA CTTGCATCAT 300

AGTTATATTA ACATTATTGT TGGTAGTTTG GATCAGTAAC CATTGCTTGT CCAGTATAAT 360

CAACCGTTAC AATTGAATAT TTTCCaTTTG CATTTGGGTC TTTAAAACTA AACACATACT 420 

TATAGTTGCC ATTATGTTCT TCAATAGAAT AATCATTATA CACTTTATTA TTACTACCAA 480 

ATTT&TTTGC TTCATTATTA GCCGCATTTA AAGCTGTTTG GAAATTTGGC AATTGCTGTA 540 

AAGCTTGATT TTTATTTCCA TTAAACGGAT AAATTTGACG TGCAACCGGC GCGGCATTTT 600 

GnCCATAATA TGGTGCAACG TAACTTGATT TTTGATTATT ATTCGCTTGG TTATTACTTG 660 

ATTGGTTATT ATTTGTTTGG TTTTGGTCAT TGTTTGTTGC ATTTGAATTA GATTGTTGCT 720 

GGTTATCGTT TGCACTATTA TCTTTATTAT CTTTGTTTAC GTCTTTACTA TCATCTTTAT 780 

TATCTTTCTT ATCTTTAGAT GAATCATTTG TTTTTTTATC TTGTTGTTCA GTTTTCGCTT 840 

TATCATCTTT TTCTTTATTA CCGTCTTTTT GTTGGTCACT ATCTTGACCA CATGCAGCTA 900 

AAAATAATGA TAATGCTAGT AACCCTGTAA CTAATCTTTT CATACATATC TCCTCCTATA 960

ATTCGATATT CATTGAATAA TCTTGAAATA CATATCTACC ATGTGTATCT TTTCATGGCT 1020 

TAAGGTTCTT TTTATTATAC CCTAATTTTT GTTCATTATT ATTTAATTTT TGTGAATTTT 1140 

ATGtTTkCTA TAAATTTAAT TATTTTACTT TAACAATTCA TTACGCATTT AGCATTTCAA 1200 

GGTATACACA ATATTTATTA CTATGATTTC ATTTTATCTG CTGCAAAAAC AATCATTATA 1260 

ACTCTTTTTC CATAATTAAA TCTGTATCCG TTACATCACC TGTTTGAAAA TGATGTTCAC 1320 

CAACCACTTT AAATCCATGA CGTTTATAAA ATGCTTGAGC ACGAGGATTA TGCTCCCAAA 1380

CTCCTAGCCA AATTTTATGT TTATTATGTT CTTGAGCAAT TTTTTCGGCC AATTCTATCA 1440 

ATTGTGAACC TCTTCCGCCA CCTTGAAAGT CTTTCAAAAA ATATATGCGC TGCACTTCTA 1500 

AATAGGTCTC CCCCATTTCT TCAGTTTGAG CACTATTAAT ATTCATCTTT ATATAACCAA 1560 

CATTCGCACC ATCTTCTTGa TAAAAATAAT GAAATGAATC TACATGGTTA ATCTCTTGTG 1620 

TAAATTTCTC TACAGTATAA TTGTCTTTAA AAAATTGATC AAAATCTTTG TCATCATAGT 1680 

AAGAACCAAA CGTGTCATAA AATGTTCTAG TTGCTAATTC AACTAATTCA CTAGCATTTT 1740

GTTCTGAAAT TTCTTTGATT ATCCCAGCCA TATAAATCCT CCAATAAACA GTGATCGAAT 1800 

CAAAATATTA CTTATGTTAT TTTTCAGCCA AAACTATTTA AAAATACATT AACACAAATC 1860 

AATTACAAAT TGTATTGATT GTGTGTAACA TCAATAAATG ATACATTTAT TCCAGTAAAA 1920

TGGCCGTATT TTCAAAAGAG AAAAAGAGAG GATGTATGGT TGTGATAGAA ACATTTAAAG 1980 

CGTTTGTAAT TGATAAAGAT GAGAGTGGTA AAGTGACACC AACTTTCAAA CAATTATCGC 2040 

CTACTGATTT ACCTAAAGGA GATGTGCTGA TTAAAGTACA TTACTCTGGT ATAAATTATA 2100 

AAGATGCTTT AGCGACTCAA GATCATAATG CAGTCGTAAA ATCGTATCCT ATGATTCCAG 2160 

GAATAGATTT AGCTGGAACA ATTGTTGAAT CCGAAGCACC AGGCTTTGAa AAAGGAGAAC 2220 

AAGTAATTGT AACGAGTTAT GACCTAGGTG TCAGCCATTA TGGCGGTTTT AGTGAATATG 2280

CGCGTGTAAA ATCAGAATGG ATTATCAAGC TTCCTGATAC TTTAACATTA GAAGAATCAA 2340

TGATATATGG CACAGCTGGT TATACTGCCG GTTTAGCAAT TGAAAGACTT GAAAAAGTTG 2400 

GAATGAATAT TGAAGATGGT CCTGTACTCG TTCGCGGTGC TTCAGGTGGT GTCGGTACTT 2460

TAGCAGTACT CATGCTTAAT GAACTTGGTT ATAAAGTTAT CGCAAGTACA GGTAAACAAG 2520 

ATGTTAGCGA TCAATTACTT GAACTTGGTG CCAAAGAAGT TATCGATCGA CTTCCTGTTG 2580 

AAGATGATCA TAAAAAGCCA CTCGCATCAT CAACTTGGCA AGCTTGTGTA GACCCTGTTG 2640 

GTGGCGAAGG TATTAATTAT GTTACAAAGC GTTTAAATCA TAGTGGGTCA ATTACAGTTA 2700 

TTGGTATGAC TGCCGGTAAT ACTTATACTA ATTCTGTATT CCCTCACATT TTAAGAGGTG 2760

TAAACATTTT AGGAATTGAC TCGGTATTTA CTGCTATGAA ATTAAGACAG CGCGTTTGGC 2820 
TTGATGAACT TCCAGAACAA CTTAACAAAG TAATTAAACA TGAAAATAAA GGGCGCATTG 2940

TTATCGATTT CGGTGTAGAT AAATAGTATT CATGAAAAAG ACATCCCGTT ATGCGAGATG 3000 

TCTTTTTTAA TTTAGTATTT GATATACATA CCGCCTGAAT CTGGTTCGGT AGGTATAAAT 3060 

CCAAATTTTG TATATAATTT ATCCGCTGGG TAGTCTGCAA TCAGAcTAAC GTATGTACTC 3120 

TCAACAGCCA CACCTTTAAT ATATTGCATA ATATGCTCCA TAATTAGACT GCCGTAACCT 3180 

TGACCTTGGT AACTTTTCAA AACTGCAATA TCAACAATTT GAAAAACAGT TCCGCCATCG 3240 

CCAATCACTC TACCCATACC AATTAACCGA TCTTTATCAT ACAAGGTTAC TGTAAATAAG 3300 

GCATTAGGTA ATCCTTTTTC aGCTGTTCGC GCGTCTTTGG ACTCATACCT GCGTTAATCC 3360 

TTAATGCGCA ATAATCCTCG CAAGTCGGAA TATCATATGT CACTTTAACC ATTATTTACC 3420 

CCACTTTTCA TCACACAATA TATCAACCTA GTATAAATGT TTATTTACAA TAGTCTTATT 3480

CGCTTCTTTA AACACTTCAT GATGACTTGA AACATAACCC TCTGCATTCG CATCTGGTTG 3540 

GATATATGTT TTAGCAAGGT TCGCTGCATT TGCACCATCA CTAAATGCAC TTGCAATTAG 3600 

ATGTGATTTT GCATCATGAT AAACAATATC TCCACACGCA TAGATACCAG GTATACTAGT 3660 

TGTCGTATTA CCAAATCCTT TAACACGACA ATCATCATGC ATATCTAGCT TTGAAGATGT 3720

TtCACTCAAT AATGTATTAC AACGATCAAA CCCATGACTA ATAATGACAT CGTCAAATTT 3780 

AACTGTATGC CTATCGCCAC TTTCAACATG TTCCAAAACA ACTTCACTTA TATGCGTTTC 3840 

ATCATCATTG CCGACCAAGT ATTTAATACG TGTTTTTGGG CATAGTTTCA CATTTAAATC 3900

TGTCACCAAC GTTTTCATCG CTTCATGACC ACTTACATCT TCTTTTCGAT AAACAACTGT 3960

CACGCTTTTA GCAATCTTGG CAATATCATG CGCCCAATCT AATGCTGTAT TTCCTCCACC 4020 

TGATATTAAT ACATCTTTAT CTTTGAAACG TCTGTAACTT TGTACAACAT AATGTAAATT 4080 

AGTT5ATTGA TATCTCTCTA CACCTTTAAC ATCTAATTGT TTTGGATTAA TAATACCCGC 4140 

ACCAATTGCA ATGATAACTG CTTTCGATGT ATATATTTCT CCCGCTTCTG TTTCAACTTC 4200 

GAAATGACGT TCTGCCTTTT TCCTAATATC TACCACACGT TCATTCAAAT GAACTTCCGG 4260 

TTTAAAATAT AATCCTTGCT TAATTGTATC TTTTAAAATT TCATGACAAG GTTTTGGCGC 4320 

AATGCCGCCA ATATCCCAAA TAATTTTTTC AGGGTAAATT CTCATCTTAC CCCCTAATTC 4380 

AGATTGAACA TCTATCAATC TTACAGACAT ATCTCGCAAT CCAGCATAAA AGCTTGCATA 4440 

CAAACCAGAC GGACCGCCAC CAATGATTGT AACATCTTTC ATTATGTGCC TCCTATGACT 4500 

CTCTATATTC ATTTCTTTCA TTAACGTGCT CAAATTGATA ATTATTATCA TTTAAAGCCA 4560

TTATACTATT AATATTTATA TTGTTAAAAT AAATCGCATA GTTAGCCATG AATTATCAAT 4620 
GAAAGATGTG TATATTTTTT AGTTCTAGTT ATATTATTTT TTAAAAGACT CATCACGTGG 4740

TTCTTTAAGA ATTGCTTGTC TTAAAAGGAA AAATAGCAAC AATAAACCTG CAAGCATACC 4800 

TGTGTGCCCA ATACCTGCAA AGCCTGCnAA TGCTTCTGGA GAGTATGATT TACCAGTGAC 4860 

TTGGAAGAAT CCTTTTGTC 4879 



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1560 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

ATAATGTCTT AGaTTGATTG GGAGTTTTTT TAATTTTTTT GAAATTAAAT TAATCTGTAs 60

yTAATAAAAA ATTTGAATAA CTGACACAyT TTTTTGATCA TAGCTAyATA CTTTGTGAAT 120 
TAATTCACAT TATAATAAGA GTGAAGATAA GAGTATTATA AATnATCTTT AAATAAATAT 180 

ATGTGAAGTA AAAATTACAC GTTAGCATAT CGATTATGgT CATTTCkTTT AACATATTAA 240

CTgGGGaACG TTAAAAGTTA ACGGkTGATA TCyAACtAAA AACAAGGTCA CAGTAGTATG 300 
TTTTAATCTG GCGTCTATTA CAAATAAAAA TTACATCTAT AATTATTCGT TTTCTTTTTT 360 

GAAAGTAATA GCCAATTAAT ATCATACATA CTGGAGTGAC TATAAGGAGG ACATTATTAT 420
GAGAGCAGCA GTTGTAACGA AAGATCACAA AGTAAGTATT GAGGACAAAA AGTTAAGAGC 480

TTTAAAACCT GGTGAAGCGT TGGTACAAAC GGAATATTGT GGCGTTTGTC ATACCGATTT 540 

ACATGTTAAG AATGCTGATT TTGGTGATGT TACAGGCGTT ACTTTAGGTC ATGAAGGTAT 600 
TGGTAAAGTC ATCGAAGTTG CGGAAGATGT AGAATCATTA AAAATTGGAG ACCGTGTGTC 660 
TATCGCTTGG ATGTTCGAAA GCTGTGGAAG ATGTGAATAT TGTACAACAG GTCGTGAAAC 720

ACTTTGCCGT AGTGTGAAAA ATGCTGGTTA TACAGTAGAT GGTGCAATGG CTGAACAAGT 780 
TATTGTTACT GCAGACTATG CTGTGAAAGT ACCTGAAAAA TTAGATCCAG CAGCAGCGTC 840 

TTCTATTACA TGCGCAGGTG TGACAACTTA TAAAGCTGTA AAAGTAAGTA ATGTAAAACC 900
TGGACAATGG TTAGGTGTTT TTGGTATAGG TGGTTTAGGT AACCTAGCTT TACAATATGC 960 
TAAAAACGTT ATGGGGGCTA AAATTGTTGC CTTCGACATC AATGATGATA AATTAGCATT 1020 

CGCGAAAGAA TTAGGTGCTG ATGCTATTAT TAATTCTAAA GATGTTGATC CAGTTGCAGA 1080

AGTTATGAAA TTAACTGATA ACAAAGGATT AGATGCAACA GTGGTAACTT CAGTTGCTAA 1140
TTTACCTGTT GATAAAATGA ACTTAGATAT CCCAAGATtA GTGCTTGATG GTATTGAAGT 1260

AGTAGGTTCA CTTGTTGGTA CAAGACAAGA CTTACGTGAA GCGTTTGAAT TTGCTGCTGA 1320

AAATAAAGTA ACACCTAAAG TTCAATTAAG AAAATTAGAA GAAATCAATG ATATTTTTGA 1380

AGAAATGGAA AATGGTACTA TAACTGGTAG AATGGTTATT AAATTTTAAA AATATCAACT 1440

GACTATATAG ATAAAGAAGG TAGTGCTCTG AACACTATCA TTATTAATCA AACCCCGAGG 1500

TTTTCCTGAA AAGATAGTGG nAAATCCCCG TGTTTTTTGG GTTTGAGGnG GTTGTnTGTA 1560




(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11014 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

GTCCTGTtlGC TGCAATGAAT ACGCCTAAAA ATCCAGGGAT GTAATGGATA CTTTGTGGTA 60

GTACTAATGA TAGAAATGAT AAAAATGAAA TCACAAAGGC TACGCTCGCA AAAGCTTGAC 120

ATGTACGCTT ATCGCCATAA TCTAACCCTG TACGTATATG TAATAAATAC TGTAATCCGA 180

TACTTAAATA CATAATTGCC ACGCATAAGA AGAATGGGAA GAATGTCTTT TCAAAGTCCG 240

GATATAGGCT GTTAGATAGG AAGACCATGA TGAACATATT AAACATCATA AACGAGACGT 300

CTTTGAATGT AACTTGACCA AATCGATTTG TAAAAAATGT TTGATGAGAC CACATTAACC 360

ATAAGAACAA ACTCATGACG ATGTATTTGA AAAATAAATC AGCTGAAATG GAACCGTTTT 420

GTGTTGTTAA AATCACATGT GCAATTTTTT GAATGGCATA GACGAAAATT AAATCAAAGA 480

ACAACTCATG GAATCCTGCA CGCTTTTCAG CTAAATGTTT TGGTGTTAAT GCATTAACCA 540

TAAAATTTTA ACTCCTTTAA GATGTGTAAT TAATTTACTA AGTATACTAT TTATTTTTTC 600

TAGTGAATAG GGGCAGATTT GGCGATGAAG TGGAAGGAGA GGTGACTGCA AGGTAATTGC 660

GGAATTAACA ATCATCAGCG ATTTAATATT TGACTGGAGA CGTCATGGTA ATAAAAAATT 720

GATGAGAAAT TGATGGTGAA ACCAGCTGTG AATAsOGaTG cAATGATrsA TAGaATTTAA 780

TTAGAGTCAT TACGCGaAAT GATTAATGAT AATTTGTGGT AAATCAAAGC aTAATTTTGT 840

ACTATAGATG AGGATGATAG AGCATATTTA AGAGGGTGAA ATGTTAAAGT GAAACCGTTT 900

ACGTTTCCGA TTGCCCAAAC AAATTACATC ATTGTATAAT ATGATTTGTT AAATGCATAA 960

CAAGAATGAA AATGTAACAT ACGTAGCAAT TGGTTTCATA AATTGGATGT TAGTGGCGTA 1020

TGACGAGAGT CGTATTAGCA GCAGCATACA GGACACCTAT TGGCGTTTTT GGAGGTGCGT 1140

TTAAAGACGT GCCAGCCTAT GATTTAGGTG CGACTTTAAT AGAACATATT ATTAAAGAGA 1200 

CGGGTTTGAA TCCAAGTGAG ATTGATGAAG TTATCATCGG TAACGTACTA CAAGCAGGAC 1260 

AAGGACAAAA TCCAGCACGA ATTGCTGCTA TGAAAGGTGG CTTGCCAGAm ACAGTACCTG 1320 

CATTTACGGT GaATAAAGTA TGTGGTTCTG GGTTAAAGTC GATTCAATTA GCATATCAAT 1380 

CTATTGTGAC TGGTGAAAAT GACATCGTGC TAGCTGGCGG TATGGAGAAT ATGTCTCAAT 1440

CACCAATGCT TGTCAACAAC AGTCGCTTTG GTTTTAAAAT GGGACATCAA TCAATGGTTG 1500 

ATAGCATGGT ATATGATGGT TTAACAGATG TATTTAATCA ATATCATATG GGTATTACTG 1560 

CTGAAAATTT AGTAGAGCAA TATGGTATTT CAAGAGAAGA ACAAGATACA TTTGCTGTAA 1620 

ACTCACAACA AAAAGCAGTA CGTGCACAGC AAAATGGTGA ATTTGATAGT GAAATAGTTC 1680 

CAGTATCGAT TCCTCAACGT AAAGGTGAAC CAATCGTAGT CACTAAGGAT GAAGGTGTAC 1740

GTGAAAATGT ATCAGTCGAA AAATTAAGTC GATTAAGACC AGCTTTCAAA AAAGACGGTA 1800 

CAGTTACAGC AGGTAATGCA TCAGGAATCA ATGATGGTGC TGCGATGATG TTAGTCATGT 1860 

CAGAAGACAA AGCTAAAGAA TTAAATATCG AACCATTGGC AGTGCTTGAT GGCTTTGGAA 1920

GTCATGGTGT AGATCCTTCT ATTATGGGTA TTGCACCAGT TGGCGCTGTA GAAAAGGCTT 1980 

TGAAACGTAG TAAAAAAGAA TTAAGCGATA TTGATGTATT TGAATTAAAT GAAGCATTTG 2040 

CAGCACAATC ATTAGCTGTT GATCgTGAAT TAAAATTACC TCCTGAAAAG GTGAATGTTA 2100 

AAGGTGGCGC TATTGCATTA GGACATCCTA TTGGTGCATC TGGTGCTAGA GTATTAGTGA 2160 

CATTATTGCA TCAACTGAAT GATGAAGTTG AAACTGGTTT AACATCATTG TGTATTGGTG 2220 

GCGGTCnAAC TATCGCTGCA GTTGTATCAA AGTATAAATA ATAAGAAAAC AGGTTATCAC 2280 

AACA$TATTA ATtACATGTT GGCATAACCT GTTTTTATTT GTTTATGGAT TTATTGGGTA 2340

ATATTAGTCA TTTGATGGTT TAATTGCAAA TGCTCTAACA GGGAACCCAG GTGCATCTTT 2400

TGGTTTAGGG CTGATAGCGT AAATGATGGC GCCACGAGTT GGTAATTGAT CTAAATTAGT 2460

TAATAACTCG ACTTGGTATT TATCCTGACC AAGAATATAA CGTTCGCCAA CTAAATCACC 2520 

ATTTTTTACA ACGTCCACAG ATGCATCGGT ATCGAATGTT TCATGACCAA CAGCTTCAAC 2580

ACGACGTTCT TCAATTAAGT ACTTCAAAGC ATCTAATCCC CAACCCGGTG CATGTTGTTG 2640 

TCCGTTCGCA TCTTTGTTTT CAAACTTTTC AATATTAGGC CAACGTTTTG ACCAATCGGT 2700 

ACGAAGTGCA ACAAAAGTGC CAGGTTCAAT AGTACCATGC TCTTTTTCCC ATGCTTCTAT 2760 

ATGCGCACGT GTTACGATGA AATCATTGTT GTTCGCTACT TCTGTTGAAA AGTCTAATAC 2820 
AAAGTGAATT GGTGCATCAA TGTGAGTACC ATATTGCGTT ACAATATTCC AACGTTGCAC 2940

ATAGAAACCA TGATCTTTAA CCGTGAATAA AGTTGAAACT TCGCCTTTTT CAAACTCACT 3000 

AAAACGTGGT ATTTCCGGAT CAAATGTATG CGTTAAATCA ACCCAAGTTG CTTGTTTTAA 3060 

AGTATTTAAT TGTTGCCATA AAGGATATTG TGTCATAAAA TCACCCGTTT TTAGTTTATT 3120 

ATATGATAAA TGCTGCGATT ATTCTTGGCG TTTAGCTTTA ACAGCATTCA CAAGCACAGT 3180 

CAATGCATCT TTAACTTCTT CTTCTTTTCG CGTTTTTAAA CCACAGTCAG GGTTTACCCA 3240 

GAATAATGAG CGGTCGATTT GTTGTAGTGA ACGATTGATT GCTGTAGTAA TTTCTTCTTT 3300 

TGTTGGAATA CGTGGACTAT GAATATCATA TACACCTAGA CCAATACCTA AATCATAATT 3360 

AATATCTTCA AAGTCTTTAA TTAAATCACC ATGGCTACGA GATGTTTCAA TTGAAATAAC 3420 

ATCAGCATCT AAGTCATGAA TAGCATGAAT GATTTGACCG AATTGAGAAT AACACATATG 3480

TGTATGGATT TGAGTTTCAT CACGAACTGA AGACGTTGCA AGTTTAAATG ATAAAACAGC 3540

ATCTTTAAGA TATTGTTCGT GATATTCAGA GCGTAATGGT AAGCCTTCAC GTAATGCAGG 3600 

TTCGTCAACT TGGATAACTT TGATTCCTGC AGCTTCAAGT GCTAATACTT CTTCGTTGAT 3660 

TGCTAAAGCA ATTTGATCTT GAACGACTTT ACGTGGTAAA TCAACACGTT CAAATGACCA 3720

GTTTAGAATT GTTACAGGTC CAGTTAACAT ACCTTTAACT GGTTTATCTG TTAAGCTTTG 3780 

TGCATAAACT GTTTCATCAA CAGTTAAAGG CGCTGTCCAT TTTACATCAC CATAAATGAT 3840 

TGGTGGTTTT ACGGCACGTG AACCATATGA TTGCACCCAA CCGAATTTAG TTACTAAGAA 3900 

ACCTTGTAAT TTTTCTCCGA AGAATTCAAC CATGTCATTA CGTTCAAATT CACCGTGAAC 3960 

TAATACATCT AAGCCAATGT CTTCTTGAAT TTTAATCCAT CGAGCAATTT CATTTTTTAA 4020 

GAATGTTTCA TATGCTTCGT CTGTAATGCG TTTGTTCTTC CAATCTGCAC GGTATTTTCG 4080 

AACTTCTCGG CTtTGTGGGA ATGATCCAAT AGTTGTTGTT GGTAAATCCG GTAAGTTCAA 4140 

ACGTTTTTGT TGTTGTTCAA TACGTTGCGC GAATGGTGAT TGTCTTGAAG TACGCACGCT 4200 

TTCGAAATCA TAATCTAAGT TTTTGAATGA TTGATTTTGG AAACGCTCAT AACGTGCTTT 4260 

TAATTTATCA TATTTAACAC TATCGTTTTG ATTAAATAGG CGACGCAATG CATCTAATTC 4320 

GTCTAATTTT TCAGTTGCAA AGCTTAAGCC TTCGCCAACA CTTGTATCTA ATGTTTCATC 4380

ATCTAAAGAT ACTGGAACAT GTAATAATGA AGATGATGGT TGAATGACAA GTTCATTAGT 4440 

GTGTGCTAAC AATTTATCGA TTAAGACTTT TTTAGCTTCA ATGTCACTTG CCCATACATT 4500 

ACGACCATCA ATAATTCCAG CGTATAATGT TTTTGATTTA TCAAAATCTC CAGCTTCAAT 4560

TTGTTTAAGG TTATAGCCAT TATCATGGAC AAAGTCTAAA CCTATACCAC CAACAGGTAA 4620 
AACACCAGCT TTTTCGAAAT AGTCATAAGC TTCACGTGTA ATATTTTCAT AGCTTTCGCT 4740

GTCGTCTGTA ACTAAGATTG GCTCATCAAC TTGAATGTAC TCAGCACCTG CATCAATTAA 4800

TGATTCAAAC ACTTCTTTAT AAAGTGGTAA TAACGTTTTA ACTTTTTCTT CAAAAGTTTG 4860

GTGACCGCCT TTTGATAATT TAACAAAAGT AATCGGACCA ACAATGACAG GGTGAGCGTT 4920 

AACGTTTAAA GATTGGGCAT ATTTAAAGCG ATCTAATAAT ACATTGCGAC TCACTTTAGG 4980

CTCAACATTG TCCCATTCAG GTACGATGTA ATGATAGTTA GTGTTAAACC ATTTTATAAG 5040 

TGCACTTGCA ACATGGTCTT TATTACCGCG AGCAATATCA AATAATAAAT CATCATCAAT 5100 

AGTTCTTCCT TGGAAACGTT CAGGGATGAT GTTGAATAAT AATGACGTAT CTAATATATG 5160 

GTCATATAAA GAGAAATCAC CAACTGGGAT GCTATCTAAG TGATAGTACT TTTGtAATAA 5220 

TAAATTTyCT TTATGTAGAT CAGTTAATGT TTGATCTAAT TCTTCTTTAG AAATCTTCTT 5280 

TGCCCAATAA CTTTCGATGG CTTTTTTCCA TTCTCTTTTT CTACCTAATC TTGGGAATCC 5340

TAAGTTTGAT GTTTTAATTG TTGTCATAAT ATTGCCTCCT TGTGAGCAGT AATAGATTTT 5400 

GAGTATGCTG CAAGTTCTAA TGAATCTTCG ACATTTTGAA ACGGTGTGAT AATGTATAAA 5460 

CCATTAAAAT ATTCATGAAC AGTATCGATT AAATCCTTTG AAAGCTTAAG ACTTAGTTCT 5520

CGTGTTTTGG CTTTATCATC TTTAACTGCT TCAAATTGTT GTAAAATTTC ATCTGACATC 5580 

TTGATTCCTG GCACTTCATT ATGCAAAAAG AGTGCGTTTT TGTAACTTGC GATAGGCATA 5640 

ATGCCTATGA AAAATGGTTT GTTCAAGTGC TTAGTGGCAT GGTAAATTTC AATGATTTTC 5700 

TCTTTGCTGT ACACGGGTTG TGTTATAAAA TAAGACATTC CGCTTTCTAT CTTTTTCTCT 5760 

AATCTTTTGA CGGCACCATA TAATTTACGA ACATTAGGGT TAAAGGCGCC AgcGATGTTG 5820 

AAGTGTGTAC GTTTCTTCAG CGCATCACCG TCAGTGTTAA TACCTTGATT AAATCTTAGA 5880 

GCGfiGTTCAG TTAATCCTTT AGAATTAACA TCATAGACAT TGGTTGCACC TGGTAAGTGA 5940 

CCAACTTTTG AAGGATCACC AGTTATGGCT AATATTTCGT TAACGCCAAT GAGCGATAAT 6000

CCAAGTAAAT GGGACTGCAA GCCGATTAAG TTTCGGTCTC GACATGTAAT ATGTACGAGT 6060 

GGTTCAATAT TGTAATATTG CTTAATTAAG CTAGCAGCAG CAATATTGCT AATTCTGACA 6120 

GTTGCCAATG AATTATCTGC GAGTGTTACC GCATCTACAT TAGCTTTATC AAGTTTAGCG 6180

ATATTTTCAA AAAATCTATC CGTGTCTAAA TGTTTCGGTG TATCCAATTC GATAATAACG 6240 

GTTGGACGTT CTTGAACCTT AGATGTTAAT GATTGTCTAA CTTTATTTTG AGATGGATTG 6300 

AAAAGTGCTT TCGTTGGTAT CGGAATCACT TTTTTGTCAT TAACAGGTTT AAGTGTCTGA 6360

ATAGATTCTT TAATAAATTT GATGTGCTCT GGCGTTGTAC CACAGCAACC ACCAATTAAA 6420
TACTTAAATT CACTATTTTC AATATCTAAT AAGCTGGCAT TTGGATAACA AGATAAGAAT 6540

GCGTGCTCTG GTAATTCAAT ATGTGTGAAA GACTCTTGCA TATGGTGCGG GCCATGATGA 6600 

CAATTGAGTC CCACGATGTT TGCACCACAT TGAACGAGTT GTTTTAATCC TTCATTGATT 6660 

GCCTGACCAT TAACTAAGTA ATTTGTGTTT GAAGCGGTTA ATTGAGCAAT GATTGGAATG 6720 

TCGTATTTCT TTCTCGTTCG TGAAATGACA TTTGTTAACT CTTCTAGGTC GTAATACGTT 6780 

TCGAAAAGTA GCGCGTCAAC GCCTTCTTCA ATTAAGGTGT CTATTTGAAT TTCAGTATGA 6840 

TAAAGAATAG TTTGTAAGCT GATATCCTCT TGTTTGATAC CTCTAAACCC ACCAACTGTG 6900 

CCTAATATAT ACGTATCTTT ATTTGCTGCT TTTTTTGCGA TGCGAACGGC GGCTTGATGT 6960 

ATTGCTTTAA CTTTATCTTC AAGACCGAAT CGTTTTAACT TTTCAAAATT TGCACCATAA 7020 

GTATTGGTTT GAATGACATC AGCACCGGCT TCAATATATG AACGATGGAT GCGTTCAACT 7080 

TTATCTGGAT GGCTAAGATT ATATGCTTCT GGACAGGTGT CTAATCCTTC AGAGTATAAA 7140 

ATGGTTCCTA TAGCGCCATC AGCTACTAAA ACATTATCTT TCAATTGTGT GAGGAATTGA 7200 

CTCATTGAAT GCCTCCTTTA ATGCGTATTT GATGTCTGCA ATGAGTTCAT CAGGATCTTC 7260 

GAGACCAACA CTTAATCGGA ATAGACCGAA AGTGATACCA CGTTCTTGTC TCACTTCTTC 7320 

AGGTAGTGCA GCGTGAGACA TTGTTGCTGG ATGTGAAAGG ATCGTTTCAA CACCGCCCAG 7380 

ACTCACTGAA ACGAGTGGTA ATGTCAGTGC ATCGACAAAT TGTTGTGCTT TAGACTCATC 7440 

AGCTAAACGA AAGCCAATAA CGGCACCGCC ATTTTTAGCT TGTTCTAAAT GAGCAGTAGT 7500 

GAGTCCCGGA TAATAAACTT CTGAAATTTC ATCTTGCTTT ATTAAAAATG ACACGATTTT 7560 

TTGAGCGTTT TCGACAGATT GTTTAAATCT GATTGGAAAA GTTTTTAAAT GTTTAGCAAG 7620 

TGTCCAGCTA TCCTGAGCAG ATAACATATT GCCTGTACCA TTTTGTATTA AATAAAGAGC 7680 

GTCACTAATT GCCTCATTAT TAGTTATGAC AGCACCAGCA ATTAAATCGC TATGTCCACT 7740 

TAAAAATTTT GTAGCACTAT GAATGACAAT ATCAGCGCCA AGTAATAAAG GTGATTGACc 7800 

TAACGGTGTC ATAAATGTAT TGTCCACAGC TACCAGTAGT TCATGCTTTT CGGCTATTTT 7860 

AGAAACAGCT TTGATATCAG TAATTTTAAA ACAGGGATTC GATGGTGTTT CGATATAAAT 7920 

TAATTTTGTG TTTGATTGAA TGGCACCCTC GATTTGTTCG AGCTTTGTAG TATCTACGGT 7980 

TGTAAATTCA ATATTAAATC GATTCAAAAT TTGCTCAGTG AGGCGAAAAG TACCGCCATA 8040 

TACATCATCG GGTAAGATGA CATGATCACC AGATTTGAAA GTCAAAAGTA CTGCTGAAAT 8100 

AGCAGCAATA CCTGATGCAA AAGCAAAAGC GAATTTTCCC TGTTCTAATC GTGCTAACTT 8160 

CTCTTCTAAA AGTTCACGGT TAGGGTTGCC CTTCGTGCAT AATCATATTT AACATCGCCA 8220 
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TCCACACCTC TACGCCAATC GAATATCACT TCTGTCTCTT TTGAAAGTGT CATACAATCT 8340 

CTCCAATCTG AGCTTTATCT AATGCTTGGA TGATATCGCG TTCGATGTCT TCATAATTTT 8400 

CAACACCTAG TGATAAGCGG ATTAAATACT CATCAATGCC ACGTTTATCT TTTTCAGCAT 8460 

CTGGCATATC AACATGTGTT TGGGTGTAAG GGAAGGTCAC TAATGTTTCA GTACCTCCTA 8520 

AACTTTCTGC AAAAATGCAA ATGTCTAAAT TTTCTAATAA TTTAGCGACG CTATAGGCCT 8580 

TGTTAAGTCT TAAACTAAGC ATGCCAGTTT GCCCGCTATA TAGTACTTCG TCAATTGCTT 8640 

GAAGTGACTG ACATTTTTTA GCAAGTTTTC TAGCGTTTGA TTGCGCACGC TCAATGCGTA 8700 

AATGCAAAGT TTTAAGTCCA CGTAACAACA AATAACTATC TATTGGTGAA AGTGTTGCGC 8760 

CAGTCATGTT GTGAAAATCA AACAACTGTT GCGCGAGTGA TTCATCTTTG ACGGTTACGA 8820 

CACCTGCTAG TACATCGTTA TGTCCGCCAA TATATTTCGT GGCTGAATGT AAGACTATAT 8880 

CAGCACCTTC TGCTAGTGGT GTTGAAAGAT AAGGTGTTAA AAAAGTATTG TCGATAATTG 8940 

ACAATAAGCC TTTAGCTTTA CAAAGTTGAT AGTATGGCTT TACATCAATA GCAATCATTT 9000 

GTGGGTTAGA TATTGGTTCA ATGAATAATG CAACTGTTTT ATCAGTGATT TCTTTTTCAA 9060 

CTTGTTCATA ATCTGTAAAA TCAACGTACT TAAATTTGAT ATCGTATTGT TGCTCGTAAA 9120 

ATTCAAATAA TCTAAATGTG CCACCATATA AATCGAATGA AACTAAAATT TCATCATGAG 9180 

GTTTAAATAG ATTACATATT AATTGAATGG CTGACATTCC ACTTGATGTA GCGAATGATG 9240 

CAATACCATG CTCAAGTTTG GCAAAACAGG TTTCAAATGT TGAGCGTGTA GGATTTTTAG 9300 

TACGTGTATA ATCAAAACCT GTCGATTGTC CTAGTTTTGG ATGCTTGTAG GCAGTAGATA 9360 

AATGGATTGG ATTCGCTATA GCACCGGTTG AATCATCGGT TAATGTGATT TGGGCTAACT 9420 

GTGTATCCTT CATATTAAGA CCCTCCTATA AGAAAAAATA AAAAAAGCTT CCGTCCTTCG 9480 

TACCCGAATG AATCGGATAA AAAGGACGAA AGCTTATGTT TCGCGGTACC ACCTTTATTT 9540 

GTTATTCCAT CGCTGAAATA ACCTTATTCA GTACGCATTA AAAGTAAATA TGCTTACTGA 9600 

ACAATTATCA CAATTAAAGT CAGTAAGTAA GGATATAGTA ATGTGCTATC CCATACTTAT 9660 

TAACAAAAAA TCGTGCGTAA AGAATCCAGT ACGCCATTTA ACATCAATGT TAATACTGTA 9720 

TCGCTATAAC GGGCGAACCC GTAGACACCT CATATTGGCA TCAACACTCC AAGGCCATTT 9780 

TCAAACACGC TTTCAAAATC TTCTCTCAGC TACTAAAGAC TCTCTGTATA AGCAGGGTGT 9840 

GTTTTACTTy CCTCTTTATT GTGTTTACGT TTCATTAAAC TGTTATAAGA TATTAATTAG 9900 

CTTACAGAGT AAAAAAAGAT TTGTCAACAA TTATTCAGAA AATTTTGATT TAAAAGTTAA 9960 

TTTGTTTGTG AAATTGTAAT TGGTATCTTG AAGTTGAAAA ATGAATTATT TTTTAAATAA 10020 
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TCAAATAAAA AGTGATGTGA GTGAATTGTC AAAAAGTGAA GATCAACGTA TTACTAAAAC 10140 

AAAAGATGAA CAAATTAAGC AAATAGATAT ATCGGATATC AAACCGAATC CGTATCAGCC 10200 

CCGAAAAACT TTCGATGAAA ATCATTTAAA TGATTTGGCA GATTCAATTA AGCAATATGG 10260 

AATTTTGCAA CCAATTGTGC TTAGAAAAAC AGTTCAAGGT TATTACATTG TAGTTGGTGA 10320 

AAGAAGGTTT AGAGCTTCGA AAATTGCTGG TCTAAAATAC GTATCAGCGA TTATCAAAGA 10380 

TTTAACAGAT GAAGATATGA TGGAACTGGC GGTCATCGAA AATTTACAAC GAGAAGACTT 10440 

AAATGCGATT GAAGAAGCTG AAAGTTATCA ACGTTTGATG ACAGATTTGA AAATTACACA 10500 

ACAAGAAGTA GCGAAACGAT TGAGTAAGTC GCGCCCGTAT ATAGCGAATA TGTTGAGGTT 10560 

ATTACATTTG CCGAAAAAGA TTGCTGACAT GGTAAAAGAT GGGCGACTGA CAAGTGCACA 10620 

TGGACGAACG TTATTGGCAA TTAAAGATGA ACAACAAATG CTTAGGTTAG CGAAACGGGT 10680 

TGTTAAAGAA AAGTGGAGTG TCAGATATTT AGAAAACCAT GTTAATGAAT TAAAAAATGT 10740 

TTCGTCAAAG TCGGAAACAG ACAAAGTAGA TATAACTAAG CCTAAATTTA TAAAGCAGCA 10800 

AGAACGACAG TTGCGAGAAC AGTATGGTAC CAAAGTAGAT ATATCAATAA AAAAATCGGT 10860 

TGGTAAAATC TCATTTGAGT TTGATTCACA AGAAGATTTT GTGAGAATAA TTGAACAATT 10920 

AAATCGTAGG TATGGTAAAT AGTTACACAA TTTTATATAA TAACTCTTTG TGCAAGTGTA 10980 

AATAAATTGT AATCAGTGAC ATTTGATTCT AGAT 11014 
(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6022 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

TCCCCTTATG GAATTTCACA TTCTAGTTTA CATAATATAT ATTATAGGAA GTTATATGTG 60 

TGTAACGCAA AAgGTACCCT ACATCATAAT CATTATCTAA TATCGTCACA TAACTTACTT 120 

ATGCTATAAT CATGGTATTA TATTGTTTGG AGTGATTTGA TGAGATTTGT CTTTGATATT 180 

GATGGTACGC TTTGTTTTGA CGGCCGATTA ATTGACCAGA CTATTATTGA TACATTGTTA 240 

CAATTACAAC ATGATGGTCA TGAACTTATA TTTGCATCAG CACGTCCGAT TCGTGATTTG 300 

TTGCCAGTTT TACCATCAGT ATTTCATCAG CACACATTAA TTGGCGCAAA TGGTGCTATG 360 

ATTTCACAGC AATCAAAGAT TTCTGTTATC AAACCAATTC ATACTGATAC ATATCATCAT 420 



GCTGCACAAC TTGACGCTGn AGAACGCGAT TTTTGAGCGT TTAGATCCAC ATAAGCTGGC 540 

CAGTTGTATT GATGTTGCAA ATATCGACAC GCCAATCAAG AkTATTTTAT TAAATATAGA 600 

CCCGGCACAA ATTACAACTA TATTAGACGA GCTAGATAAA TACCATCAAG AATTGGAAAT 660 

GATTCACCAT TCAAATGAGT ATAACATTGA TATAACAGCG CAAAATATTA ACAAATATAC 720 

TGCATTACAA TATATATTTG ATGCAGATGT TAAATATATA GCATTTGGTA ATGACCACAA 780 

TGATATTGTC ATGTTACAAC ATGCTAGTAG TGGCTATATT ATAGGACCAT CAGAAGCATA 840 

CACACACGCA ATATTGAAAC TTGATAAAAT CAAACACATC AATAATAATG CACAAGCTAT 900 

TTGCAAAGTC TTAAAATCAT ATAAATAAAA ACACCCCTAT CAAATGATAA TCATTATCAA 960 

TCGATAGGGG CTATTTTAAT AAAATTCGTC CTCGAACATT TCTTCCTCTT CATCTAATCC 1020 

AAATAATTCT GCCATTTCTC CATGTTCAAT TAACATGTTT AAATATGCAT CGCGGAGTTC 1080 

TTCTTCACTC ATATCATTAA TCATTTCTTT AAGACTATCA ATCCACATAT TTCTGCGTAA 1140 

TTGATAGTCT TCTTCAACTT CGTTTAACAT CATTATATGT TTATTTGCTG CTTCTGGACT 1200 

AGCTGTAAAG AGTAATGCAA TCATATGTTT ACATATCACT CGTCTTCCAT CAGCATGAGG 1260 

ACAATTACAT ATGGATTTTC TAGGATGTTC CATATCAATA TAACAACGAT ATACTTTGTT 1320 

GCCACTGCCC TTTACTTCAG CCTCATGCTG CGTTTCTGAA AATGATTTTA AGTTAATGAC 1380 

GCATTCACTT TGATAATAAT TAAAGCCTCT TTCTATAGAA CGAATACTTG CAATATCAAG 1440 

TAATCCCATT AATGaTACTC CTTTTTATTA TTATTTTTAA ATAAAGAaAA TAAAATAGAT 1500 

AAGTGTCTAG ATTAAAATAC TTGATTTATC TATATTTTAT AACAAGTCTA GAATTATCGC 1560 

ATTCTTAAAT AACTAATATG AAAATGcTTG CACTAATTCt TTTGTATAAG GGTGTCTATC 1620 

AACATTAAAT AATTCCtCTA TTGCAAAATC ATCGACTATC ATGCCATCCT TAAGAACGAT 1680 

AATTCTATTA ACTAAGCGTT GTAACACGGA TAAATCATGA GAAATAACGA TAAAATGATT 1740 

TAAGTTCGTA ATCGTTTGCG CTTTTAATAT ATTGATTACA TTTTGTTCAG CTATAACATC 1800 

TAAATTTGAA GTTATCTCAT CACATATTAA AACGCGAGGC TGTGCTAATA ACGAACGCAT 1860 

GACATTAAAT CTTTGTAATT GTCCGCCACT CACTTCGCTT GGTAATTTAG TCAATAATTG 1920 

CGCGTTTAAC TCAAAAGTAG ATAAATGTTG TAATAATAAT TGATCCTGAG CAGTATTATC 1980 

AGTTAGACCT CTGTAATAAT ATAACGCTTC TTTTAATGAG GTCTCAATCG TCCAATCAGG 2040 

GTTAAAGCTA GTTAAAGGGT GTTGGAAAAT CGGTAACACA GCATTGTCAC TTAAGTAAAT 2100 

CTCTCCTTTA ACAGGTTTAA ACAAGCCAAG AACCAATGAA GCGAGCGTAC TTTTACCACA 2160 

GCCACTTTCG CCTAAAATAC CAACATTTTC TCCATCAGGT ATAGTAATAT TGATATCTTG 2220 
CCCTCTTTAA TTGTGTTCTA TATTTAATTA GACGTTCAGT ATACGGATGC AAATGCTCAT 2340 

ACTTGAAATG ATTAATATTA CCTCGTTCAA TGATTTGACC TTCTTTTAAA ACATAAATGT 2400 

ACTGACAATA TTTCAATACA TGACTTAAGT TATGTGTGAT AATAAATAAT GTTTGACCAT 2460 

GTTCTAATAC AATATGCTGT AATAAATCCA TCACTTGATT ACCGTTCAAA GCATCCAATG 2520 

ATGCAACTGG TTCGTCTGCA ATGATTAATT TAGGCTCCAA CATGAGAACG CTTGCTATGT 2580 

ATACGCGTTC AAGTTGGCCC CCAGAAAGTT GGAAACTATA TTTATTTAAT ATATCTTTGC 2640 

TTTGTAAATT AACCCACGAC AAAGCCTTAT CAACTTTGGA CAAAGCCTCT TCTTTACTAC 2700 

CTTTATAATG CTTACGATAA ATCGCAGTTA ACTGTTTACC TAATTTAGTA TGGTCGTTAA 2760 

AACTTTCTGC ATAATTTTGA GAAATATAGC CAATTGTATG ACCATAATAT TGACTCAATC 2820 

TACTAACATT TTCCCCATCA AATTGGTACG AATCATACGT GCAGCTTAAA TCAAATGGTA 2880 

AATATTCAAG TAAAGCTTTA GCAATCAAAC TTTTTCCAGC GCCGCTCTCT CCAATCAAGG 2940 

CATTAATCTG TTGACTAAAA ATTTTCAAAT CAATCCCTTT AATAAGAGAT TTCTCACTAG 3000 

TATTCTTTAT TGTTAAATTT TGTATATCAA TGAGACTCAT CATATTCACC CCGTTGTTTC 3060 

AGCAATCTAT CTCTTAGTGC ATCACCGGTT AAATTAAAAA TTAAAATAGT TATAGCAATG 3120 

ACTGAAGCAG GTGCAATCAA CATAATTGGA TGAGACGAAA TAAAATCACG ACCTTGTTGC 3180 

AACATAGCGC CCCaCTCTGG TGTTGGCGGT TGTGCACCTA ACCCAATAAA TGATAGTGAA 3240 

CTTATATATA GAATGATTTT ACCGAAATCA ACGACCATCA AAACGATAAT AGCCGGTATA 3300 

ATTTTAGGTG TTAAATGACG TATTAATATT GTTCTTGTTG GTACATGAAA TAATTGTGCC 3360 

ATTTTTATAT AAGGCTTATT CATTTCGCTA TTAACTATAC TTCTAGTCAA CCTTGTGTAA 3420 

TTCATCCATT TTATTAATGT AATTGAGATA ACTAAATTCC ATAAAGATGG TTGAAAAAAA 3480 

CTTGcTAAAG CAATCATGAT GATAAATTCT GGAATACTTA GACCAACATC AATAAACCTT 3540 

AACACTAATC GTTCAATCCA CCCTTTTTTG TATCCGGCAA ATAGACCTAG TGTAACACCT 3600 

ATGACAACGA TAGCTATTAA TGTTAAAACA GTAACAAACA ATGTTGAACG TGCACCGATA 3660 

ATAATTCGGG TAAATAAATC TCTCCCATAA TCATCAGTTC CTAATAAATG CAACCAACTA 3720 

ATAGGTTCAA AAGTTTGTGA TAAATTGACT TTGGTTGCAT TTTCACTACT GACAAAGAAT 3780 

TGCAGTACAA TTACCACAAA AATAAATGCA ACGAATACAA AAAATATCAG GTTATTCTTT 3840 

GAAAATATTT TATGCATGAC GGTCACTACT TTCTGATATC AATGGTGTAT TGGTTTTGAT 3900 

TTTTGGATTT CCTAATTGTA AACGCTGCTT CGGATCAAGT AATAACGTTA ATAAATCAGC 3960 

AATCGTATTG ATAATAACAA CGAAGAAGCC AATAAATAAC ACGCATCCTT GAATAACAGG 4020 

ATTTTCAATC ACTACAGTAC CACCTATTAG ACTGCCAAGT GAAATCCCTA GTAATGGGAT 4140 

AATCGGCAAA ATTGTTGGTT TTAGTAAATC ATGAATTAAA ATATAACGTT CATTCATACC 4200 

GCGTAATCTT GATGCTTGTA CGATATTACT TTGCAATAAC ATCAATAAAT TAGAACGCAC 4260 

TAAACGAATG ATGTATGCAC ACATACCTAA AGATAGCGTG ATTACAGGTA ATATAAACTG 4320 

ACTTAGTATA ACGCTATCTA TATTCATTAA ATTTGTGACA ATAAATAATA AAATAATACC 4380 

GATAAAGAAC GCTGGTAAAC TAATCGATAG TGTTGAGATC ACTCTAATCA CTTTATCCGT 4440 

CCACTTATGA AATCGTTTGG CTGCTATAAT GCCGAGCGGT ATAGATATGC ATAACGACAC 4500 

TACTAATGTT GAAAATGATA TGAGTAATGT TATGGGTGCA TAGTTGAATA ATATCTGTGT 4560 

TACCGGTTCT TTTGATTCAA AACTTTTTCC TAAATTAAAA TGTAATAAAT GATTCATCCA 4620 

ATGCCACCAC TGTACCAATA AAGAATCATT TAATCCCAAT TTATCTTTGG TTGCATTTAT 4680 

TTGTTCCGTC GACACTTGTG CTACATCAAG ATGTAATATT TTATCAACAG GATTGCCTGG 4740 

TGATAATTTC ATTAAAATGA ATGTAAGTGT AGAAATAACA AATAAAACAA CTATCATTTG 4800 

CATCAGTCTA TACAACATAG ACTTTATTAT GAACATAATA GTCCCCCTCC TTGTGTAAGT 4860 

TACTAACACT TTCTTTTTAC ATGAGAATGG CGCATGTATA TGCAACTTAC ATATTAAGAA 4920 

CTAACGTTCA TTATAGTATT ATCCATAAAG AAATTGAAGT ATATTTAATT TTTTAACAAA 4980 

ATCATTATAA AATATAATAT TTTGAATCAA GTCAACCATG TAAAATATAA AAAAGTCAAA 5040 

ACAAAAACAA CTATAGCACT GTATTCCATC TCTTTCGAAA TAATTGTTAC TGCAGTGTAA 5100 

CTTAAAAGTC GATGATTTTG TGCATATAGT TGTCGAATAT TATTTTTTAT CTTTACGGCG 5160 

AAGTTCAGCG CCCTCATAGC CGTATTTTTC AATTTGCTTT TCTAATTTAC GCGCTTTTCT 5220 

TTCTTTACGC CAATTTCTAG TAAAATACCA TAATAGAAAA CTAATTAATA AACTCATAAT 5280 

CGCTAAAAAT GCAGCGTATC CTAATAATGG TTGATATTTT ATATCTTGAA AATTTGGAAT 5340 

AAAAAATGCA AGCACACCTA ATATAACAAA TGTAATTACT GCAGATACAA ACCATTTATT 5400 

TAAAACTAAG CAACAGAATA TTGTTAATAA AATCATTATT AATGTTGTGA TCCATAAATA 5460 

ATTAGGCATA TCGAATAATG TCATATTCAT TCTCCTTTTA TTTCATTACT TTCCTTGTAT 5520 

ACATTTTATT ATAAATTTTT AAAAACTTAA ACAATAGCAG TCAGTTTCAA GCAATATTCT 5580 

ATCTACTAAT AGAAAAATCA TTGTTCCTTG CGACATGGAA ATCGTAACAT TATCGTTTAG 5640 

GAGACAAAAT TATGTATAAT GAATGTATTA TACCAAAGGA GTGATTATAT GTCTCAAGGT 5700 

TTACCTTTAA GAGAAGATGT TCCTGTTTCA GAAACATGGG ATTTAGTAGA CTTATTTAAA 5760 

GATGATCAAC AATATTATGA AAGTATTGAC GCTCTAGTAC AnCAAGCAAA TCAATTTCAT 5820 

GAAAATATTT TAATTGCCTT AGATCGCTTA AGTAATTATG CAGAACTACG TTTAAGTGTA 5940 

GATACTAGTA ATATCGAGGC ACAAGTATTG AGOGCTAAAT TATCTACTAC ATACGGTAAA 6000 

ATTGTTAAGC CAATTATCCT TT 6022 
(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 476 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

CCATCAATAA TGTATACATG ATTGGCATCA TATTCCCCTT TAATTAGAGA GCTACGTACA 60 

GTTTGTyTTA TTAAAGTAGA ACTAATAAAT AACCATCTCT TATGTGCACA AACACTTCCC 120 

GCAACAATTG ATTCAGTTTT ACCAACCCGT GGCATACCTC TAATGCCAAT CAACTTATGA 180 

CCTTCTTCTT TGAACAATTC AGCTAAAAAG TCTACTAACA AGCCTAAATC TTCACGCTCA 240 

AATCGAAAGG TTTTCTTATC TTTTGCATCT TGCTCAATAT ATCTTCCATG TCTTACTGCA 300 

AGACGGTCTC TTAATTCTGG TTTTTTAAGC TTTGTTATTT CAATTTCATT TATACCACGA 360 

GCTATTTGCT CAAAACGTTC AACTTTTTCA AGATTGTCTG TTTTAATTAA AAGGCCTCGT 420 

TTACCTTGAT CAACACCATT AATTGTAACA ATACTTATAC CTAACATACC TAATAA 476 
(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 



AGAAATACAA CGAAGCATAT AAATATAACC GATCITITIT CTAATTGAAT ATTAAGTAAG 60 

TGTATGTACT TTCTGGAAGT AGCACCTAGT rGGATTGTtC CTCCTACAAC AGGCCAAAAA 120 

TTTTTATTTT TAACTGGCTT AACAGTGTTC AGTTTTTCAT ACTCTTCTCT ACTAATTTTG 180 

GCGCACCTTT TTGGAATGAA CCAATTAATA AATGGAAAAA AGTATACAAG CCAAGTTCTT 240 

ATTACATCGA CCATTAAATA CTCATCATCA TACTTAATAA CTCTGTATTT CGGATTTTTA 300 

TTGATAATTT CGGTTTCACA AAGCAATAAT TATCACTTCC TATTAATAAC AAATTCACAC 360 
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TTATATGACC TTAAATATAT AACATGAATC TTTTTGTCTA TTATTGAAGA CATATTTATA 480 

AAGAAAAATA GCATTGTCAT AATAACCCAA GCAATAAATA CTATAATATT TTGGATAGAT 540 

AAACTAATCA rrACATCTAA GAACATGATT gATAATCCAC CACAGAAAAA ATAAGAAAAT 600 

AGTACAAAGC AAAGATTCTT GAATGATGGA AAAATCATAA TTTTTCCATT GCTACTCCGA 660 

TCATTATAGA TAGATAACTT TACTTTCTGA TTTAAATATA TATAAAACAC TAGAATACTT 720 

AATAATAAAA CCGAACAAAT GATAATAACG CAATTTTTTT CTAAATGAGA ATCAGGTATA 780 

TATATTTTAT CTCTAAACAT AGTGCCAAAT AAAAGTATGC TACCTATAGC TGGCCATAAA 840 

GCTTTaTTTT TAACTGGTTT GACAATATTT AAATTATCAA AATCTTCTCT GCTGATTTGG 900 

ACATATTTTT TTGGTATTAA CCAATTAATA AACGGAAAGA ACAAAACTAA CCAGGTGCTT 960 

ACTAAATCAA TCATCAGATA GTCGTTTTTA TATTTAATAA TTCTATATCT GGGATTTTTG 1020 

TTTACAACTC TAACCTCGCA AAGCAATATC TCCACTTCCG TCTCGTTGGT TTTATATCTA 1080 

ATACACTTTC AGATACTTTA TAAGTGTTTT GTATTTTAGT AACATACTAT TTTCCTGTTT 1140 

ATTACTTAAC TTACGAACTA CAATCTAAGT TTAGTAATTT CTATTGCTTT TTAAGTTTGG 1200 

CATAAACCTT TTTATTACTA ATTGAGCCCA TGCTTATTAG AAAGAAAAAA ATTGTAATAA 1260 

TAATCCACAT AATAAATACC AGTAGATTTT GAGGTTTTAT AGTCATTAGC CATATTAAAA 1320 

ATAATATAGA ACAACCTCCT AATAATAGAT ATGTGAAAAC TATAAAACTT CCATCTTTAA 1380 

AAGTAGGCAC TAATATAACC CTATTTTCAT TATCTAGATT ATCATCATAT ATCTTTAGTT 1440 

TAAGCTTTTT ATTTAAGTAA ATGTAAAATG CTGCAATACC TATAAATCCT ATAAAACATA 1500 

AAGATATTAA AATCTTATTA TCTAATTGAA CTTCAAACGT ATGTACATAT TTCCGTAAAA 1560 

TAACTACAAA TAAAAACGAA CTACCAGTAA CTGGCCAGAA AATATTATTT TTATTTTGTT 1620 

TATCAACATT TAAATTTTCA AGTTCCTTCT CACTAAGTTT TGCATACCTT TTGGGAATGA 1680 

ACCAATTAAT AAAAGGAAAA AAGTATACAA GCCAAGTGCT TACTAAATCA ATTAACAAAT 1740 

ACTCATCATT ATATTGAACG ACTTTATATC TCGGATTTTT ATTAATAACC TTAATATTAA 1800 

AAAGCAAAAC TCACCACGCC CATTTCATTG GATTTATATG ATTGCTAATA ATATTTTTAG 1860 

CTTCACTAAC AGCATTCCCA ACACTATCCA TGGATTTTTC TGTAGTTTTT TTAACAACAT 1920 

CTATACTATT ATCGATTTTA TGCCCTACCC AGTCTACTTT ATCTTTTAAT CCAAAAATAT 1980 

TATTTTGATA AATTAAATCT GTTCCTAATG CAAATACTGT ACTCATAGCC AAACCTGCTA 2040 

AAATCACCCA TCCTACTGGA TTACTTCCTA AAACAAAAGT CGCTAATCCA GCTCCAACTG 2100 

CTGTCCCTGC AGATCCAGCT GCAAGCGTgC ATACCATTAT GCGACAACGC CTCTCCAAAT 2160 

CCTTTACCTA GGTATTTTCC GCCTTTTGCA AATTTACTAC CATTTTCTAT AAACACATTA 2280 

CCTGATGTAC GTTTGACTTC CACAAATGAA TTTGGACCTG CTGGGCCTTT CACTCCACCT 2340 

GCTGTATTGa TAAATACACC GAATTTACTT GcATTTATAC CGTCTTGCTC TAAAAGTGTT 2400 

GACGTAATAT CTAATCCTAT ATCTCTTTTA ATACTGTCTT TATTGTCATT TATATATTTC 2460 

AATATACTTT TCGGGATATC GTCTTCTGGA TGTTCTTTGG CATATGCCTT TATAACAGCA 2520 

AAGTCTGCTT TATTTAAAGT TTCTTTCTCT GCTTTATGTT CAATTTTCCC CATAGCAACT 2580 

TTCAAATATT TTTCATGACT TGCTTTGGCC CAATCAAGTT CTTTACCTGA AGGAATATTA 2640 

AATTGATTTG TTGAAAAGTT CCAAAAATTC TGCGCTTGGG TAAGTCCTTG TTGGACAATT 2700 

TTTTGAAATT CTTCAACTTC TTTAAATATT TCTGGTGATT TTTGATTAAA CTCACGCAAT 2760 

TTGCGTAGCT TCTCTTCTAA TTCATGTTTT TGTTGACCTA ATGTTCGTAT TATTTGTTGG 2820 

TTCGATGAAA TGGCTTGCTG ATTATCGGAA GCATGCTTTT TCAAATTGTT ATTCAAATTT 2880 

TCATATCGCG TAATTTGTTG ACTTAATGAT CTGATATCTT CTTCAAGCTC TGATTCTTTT 2940 

AAAGATATGC TATCAACCTC ACTCGTATAA CGTGACACAA AATTaTCGCA AGCTTGCTTC 3000 

GTTAAATCAC TCAATGTTTT CATACTTGTT GATAATGGAA TTAACACCGT ACTAAAAAAT 3060 

TGCTTAGCTG ACGTATACGC TTTCCCTTTA AGCGCATCAT CATTAATAAA TTGAGTAATT 3120 

GCTTTTTCCA ACGCATCATA ATTTGAATTC ATTGTTTGAC TCAAATTCCC CACACTTGAA 3180 

GCTTGGTTTC GAGATCTGTC TAAATACATG TCAATACTCA TCGGCATGCT CCTTTTTCAA 3240 

AAATATATGA TTTTCAAACT ATTTAAAATC AAATGCTTTT TACATCTACA AAGTTGTAAA 3300 

ATTTTAAAAC TCGGCGATGA TTATTTCTTA TGTAAAGGAG TCTAGATGCA GGTAAATTGA 3360 

GATAACATGT CGCCTTTTTT CTTATTTTAG CATATGGATA TAATGGTGTC TTTGTATATT 3420 

CGCAATTAAT CAATAAAAAT TATCTTTCAA TATTTTAATT TTATTGCGAC AACATCCTTA 3480 

ACATTAAATA TATTAATATC TCAAAATATA TTCACTATTA AAATATGTCA TCAGTTGTTA 3540 

AAAGTATTTC CTCATCATGC GAAATATCAA AACGTATCTA AAATACGAAT AAGTTTATAC 3600 

AATCACACAA CATCATCATT CAAAATTTTA TTG 3633 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2365 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
TGATACGAAt GCATTACAAT TCATATGCAA CATACAATTC CTTCTACAGC AAATGAAGTG 60 

AAACAAATAG TTGATGTGAC ATCTGTAGCA GAAAATGATA CGCATTAGTC ATAAAATTAA 120 

ATGGAAATGT CGATGAAGTG TATCAGCAAT TACAGCGATT AATTAAGAAT GCTAATGTCG 180 

AAGAGAGTGA GAATACTGAC AATATTAATA GTCAAGATAC AAGTTATACA CCTCAAGTAA 240 

AAGTAACAAC ACCAATTTTA GTGAAAGCAC CAATCGCTGG TCGTCGTATT TTACTTAAAG 300 

AAGTAAGAGA TTCAATTTTT AGAGAGAAAA TGGTAGGTGA AGGCTTAGCA ATCAAAGCTC 360 

ATGAAGAATC CAAAGTAATC GCACCGTTCA ATGGTTTAAT ATCTATGATT GTACCAACTA 420 

AGCATGCAGT TGGTATTCAA TCAGAAGACG GTGTGGACAT AGTCATTCAT ATTGGCGTGA 480 

ATACAGTTGA CTTGGAAGGT AAAGGGTTCA AGTGCTTTGT AAAGCAAAAT GATCATGTTG 540 

AAGCAGGGCA AACGTTGTTG CAATTCGACC AGCAATATAT ACAACAACAA GGCTACAATG 600 

CTGACGTTAT TGTCGTTATT AGCAACTCTG CCGATTTAGG AAAAGTAGAA CTGACAATGA 660 

ATGAAATCAT TACGACTGAA GATGTTATTT TTAAAATATT TAAAAACTAG GAGTGTGTTG 720 

TAATAATGAC AAAATTACCG CAAAATTTCA TGTGGGGTGG CGCTCTTGCC GCAAATCAAT 780 

TTGAAGGTGG ATATGATAAA GGTGGTAAAG GGTTAAGTGT AATTGATGTT ATGACGAGTG 840 

GTGCACATGG CAAAGCACGT CAGATTACAG AATCTATAGA TCCCAATCAC TATTATCCAA 900 

ATCATGAAGG TATTGATTTT TATCATCGTT ATAAGGAAGA TATTGCCTTG TTTAAAGAAA 960 

TGGGATTGAA ATGTTTACGT ACGTCGATTG CGTGGACACG TATCTTTCCG AATGGGGATG 1020 

AAGATGTGCC AAATGAAGAA GGACTCGCCT TTTATGATCG TATCTTTGAT GAATTAATTG 1080 

CACAAGGTAT TGAACCTGTT GTGACGTTAT CACATTTTGA GATGCCACTT CATTTAGCGA 1140 

AACATTATGG TGGATTTAGA AATAGAGAAG TTGTCGATTA TTTTGTGCAT TTTGCGCGTG 1200 

TTGTATTTGA AAGATATAAA GATAAAGTTA CATATTGGAT GACGTTTAAT GAAATTAATA 1260 

ATCAGATGGA CACATCAAAT CCTATCTTTT TATGGACGAA TTCTGGGGTA GCATTGACAG 1320 

AAAATGATAA TCCTGAAGAA GTCyTGTATC AAGTAGCACA TCATGAACTT TTAGCCAGTG 1380 

CyTTAGCAGT TCGTCTTGGT AAAGaGATtA ATCCgAaGTT TAAGATTGGr ACmATGATTt 1440 

CAmaTGTACC CmTTTATCCa TAwTCGTGTC ATCCGAAAGA TATGATGGAA GCACAAATTG 1500 

CGAATCGCTT ACGTTTCTTT TTCCCGGATG TCCAAGTGAG AGGTTATTAT CCAAGCTATG 1560 

CTAAAAAAAT GTTGGCACGA AAAGGATATG ATGTTGGATG GCAAGAAGGG GACGACAGTA 1620 

TTTTACAGCA GGGCACGGTT GATTATATTG GCTTTAGTTA TTACATGTCT ACGGCTGTAA 1680 

AACATGATGT TGATACTACA GTTGAAAACA ACATCGTCAA CGGTGGTTTG AATCATTCTG 1740 
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GATATACATT GAATGTGTTA TATGATCGTT ATCAGTTACC ACTTTTTATT GTGGAAAATG 1860 

GTTTTGGTGC AGTTGATGAA GTGGTAGATG GACATATTCa TGATGATTAT CGCATTGAAT 1920 

ATTTAAAAGC ACATATTACA GCAGCGATAG AAGCAGTTGA TCAAGATGGT GTAGATTTAA 1980 

TCGGTTATAC ACCGTGGGGA ATCATTGATA TTGTTTCATT TACAACCGGT GAAATGAAGA 2040 

AACGCTATGG TTTAATATAT GTTGATCGAG ATAATGATGG TCATGGCACG ATGGAACGCT 2100 

TGAAAAAAGA TTCGTTCTAT TGGTATCAAC AAGTGATAGC ATCAAATGGA GATAAATTAT 2160 

AAAGGTATAT TATAAGTATT TTAGGGTTAG AGCCCGAGAC ATAAATTAAT ATAGTAGGAC 2220 

CTACAGTGTT ATAATGGCGG gCCCCCAACA CAAAGAATTT CGAAAAGAAA TTCtAcAGGT 2280 

aATGCaAGtT GGCGGGGcCC AACACAGAGA AATTCGAAAA GAAATTCTAc AGGTAATGCA 2340 

AGTTGGGGAA GGACAGAAAT AAATT 2365 
(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

CTGCGATACG ATTTGTTGAA AGTGGGGAAA ACAAAAAAGT TATCATTACC AATTTAGAGC 60 

AGGCATACGA AGCTTTGATT GGTAATAAAG GTACACACAT TCACATGTAG CACTTTATCA 120 

CGCGACAAAA CATTAAATAT GTTTCTCCGT TGATTCAAAT GAAAAAGTTG TCTGCTGACA 180 

CTTTGCAAGG TTTGAAGGAG TTTAACTTAT GACAGAAAAC TTTATTTTGG GTAGAAATAA 240 

TAAAXTAGAA CATGAACTAA AGGCATTAGC AGATTACATT AATATACCAT ATAGTATATT 300 

ACAACCATAT CAAAGTGAAT GTTTTGTCAG ACATTATACG AAAGGCCAAG TTATTTATTT 360 

TTCGCCACAA GAAAGTAGCA ATATTTACTT TTTAATTGAA GGTAACATTA TTAGAGAACA 420 

TTACAATCAA AATGGAGATG TATATCGTTA TTTTAATAAA GAGCAAGTAT TATTTCCAAT 480 

CAGTAACTTA TTTCATCCGA AAGAGGTTAA CGAATTGTGT ACAGCATTAA CCGATTGTAC 540 

AGTTCTTGGA TTGCCTAGAG AATTGATGGC CTTTTTGTGC AAAGCTAATG ATGATATATT 600 

TTTGACACTT TTTGCATTAA TAAATGATAA TGAGCAGCAA CACATGAACT ATAACATGGC 660 

ATTAACAAGT AAATTTGCTA AAGATCGAAT TATCAAATTG ATATGCCATC TATGTCAGAC 720 

AGTAGGATAC GATCAAGATG AATTTTATGA AATCAAACAG TTTTTAACTA TTCAACtCAT 780 

TGAAAAACTT GTTGTTAAAG ATCATAAAAA TTGGTTAGTA AGCAAACATT TATTCAATGA 900 

TGTATGTGTT TAATATACAA TGTAAAATGA ATAAGTTGAA CATGAGGTCT AACGTACATT 960 

TATACGTTAG GCCTTTTTTG CTAGCATGAT GAATAATTTA AAATGTTAGT TAAATTTGAT 1020 

TGTTGAAATT ACAGTAAAAT TTAAGGTGAT GAAAAATTTA GAACTTCTAA GTTTTTGAAA 1080 

AGTAAAAAAT TTGTAATAGT GTAAAAATAG TATATTGATT TTTGCTAGTT AACAGAaAAT 1140 

TTTAAGTTAT ATAAATAGGA AGAAAACAAA TTTTACGTAA TTTTTTTCGA AAAGCAATTG 1200 

ATATAATTCT TATTTCATTA TACAATTTAG ACTAATCTAG AAATTGAAAT GGAGTAATAT 1260 

TTTTGAAAAA AAGAATTGAT TATTTGTCGA ATAAGCAGAA TAAGTATTCG ATTAGACGTT 1320 

TTACAGTAGG TACCACATCA GTAATAGTAG GGGCAACTAT ACTATTTGGG ATAGGCAATC 1380 

ATCAAGCACA AGCTTCAGAA CAATCGAACG ATACAACGCA ATCTTCGAAA AATAATGCAA 1440 

GTGCAGATTC CGAAAAAAAC AATATGATAG AAACACCTCA ATTAAATACA ACGGCTAATG 1500 

ATACATCTGA TATTAGTGCA AACACAAACA GTGCGAATGT AGATAGCACA ACAAAACCAA 1560 

TGTCTACACA AACGAGCAAT ACCACTACAA CAGAGCCAGC TTCAACAAAT GAAACACCTC 1620 

AACCGACGGC AATTAAAAAT CAAGCAACTG CTGCAAAAAT GCAAGATCAA ACTGTTCCTC 1680 

AAGAAGCAAA TTCTCAAGTA GATAATAAAA CAACGAATGA TGCTAATAGC ATAGCAACAA 1740 

ACAGTGAGCT TAAAAATTCT CAAACATTAG ATTTACCACA ATCATCACCA CAAACGATTT 1800 

CCAATGCGCA AGGAACTAGT AAACCAAGTG TTAGAACGAG AGCTGTACGT AGTTTAGCTG 1860 

TTGCTGAACC GGTAGTAAAT GCTGCTGATG CTAAAGGTAC AAATGTAAAT GATAAAGTTA 1920 

CGGCAAGTAA TTTCAAGTTA GAAAAGACTA CATTTGACCC TAATCAAAGT GGTAACACAT 1980 

TTATGGCGGC AAATTTTACA GTGACAGATA AAGTGAAATC AGGGGATTAT TTTACAGCGA 2040 

aGTTACCAGA TAGTTTAACT GGTAATGGAG ACGTGGATTA TTCTAATTCA AATAATACGA 2100 

TGCCAATTGC AGACATTAAA AGTACGAATG GCGATGTTGT AGCTAAAGCA ACATATGATA 2160 

TCTTGACTAA GACGTATACA TTTGTCTTTA CAGATTATGT AAATAATAAA GAAAATATTA 2220 

ACGGACAATT TTCATTACCT TTATTTACAG ACCGAGCAAA GGCACCTAAA TCAGGAACAT 2280 

ATGATGCGAA TATTAATATT GCGGATGAAA TGTTTAATAA TAAAATTACT TATAACTATA 2340 

GTTCGCCAAT TGCAGGAATT GATAAACCAA ATGGCGCGAA CATTTCTTCT CAAATTATTG 2400 

GTGTAGATAC AGCTTCAGGT CAAAACACAT ACAAGCAAAC AGTATTTGTT AACCCTAAGC 2460 

AACGAGTTTT AGGTAATACG TGGGTGTATA TTAAAGGCTA CCAAGATAAA ATCGAAGAAA 2520 

GTAGCGGTAA AGTAAGTGCT ACAGATACAA AACTGAGAAT TTTTGAAGTG AATGATACAT 2580 
ACCAATTTAA AAATAGAATC TATTATGAGC ATCCAAATGT AGCTAGTATT AAATTTGGTG 2700 

ATATTACTAA AACATATGTA GTATTAGTAG AAGGGCATTA CGACAATACA GGTAAGAACT 2760 

TAAAAACTCA GGTTATTCAA GAAAATGTTG ATCCTGTAAC AAATAGAGAC TACAGTATTT 2820 

TCGGTTGGAA TAATGAGAAT GTTGTACGTT ATGGTGGTGG AAGTGCTGAT GGTGATTCAG 2880 

CAGTAAATCC GAAAGACCCA ACTCCAGGGC CGCCGGTTGA CCCAGAACCA AGTCCAGACC 2940 

CAGAACCAGA ACCAACGCCA GATCCAGAAC CAAGTCCAGA CCCAGAACCG GAACCAAGCC 3000 

CAGACCCGGA TCCGGATTCG GATTCAGACA GTGACTCAGG CTCAGACAGC GACTCAGGTT 3060 

CAGATAGCGA CTCAGAATCA GATAGCGATT CGGATTCAGA CAGTGATTCA GATTCAGACA 3120 

GCGACTCAGA ATCAGATAGC GACTCAGAAT CAGATAGTGA GTCAGATTCA GACAGTGACT 3180 

CGGACTCAGA CAGTGATTCA GACTCAGATA GCGATTCAGA CTCAGATAGC GATTCAGACT 3240 

CAGACAGCGA TTCAGATTCA GACAGCGACT CAGATTCAGA CAGCGACTCA GACTCAGATA 3300 

GCGACTCAGA CTCAGACAGC GACTCAGATT CAGATAGCGA TTCAGACTCA GACAGCGACT 3360 

CAGACTCAGA CAGCGACTCA GACTCAGATA GCGACTCAGA TTCAGATAGC GATTCAGACT 3420 

CAGACAGCGA CTCAGATTCA GATAGCGATT CGGACTCAGA CAGCGATTCA GATTCAGACA 3480 

GCGACTCAGA CTCGGATAGC GATTCAGATT CAGATAGCGA TTCGGATTCA GACAGTGATT 3540 

CAGATTCAGA CAGCGACTCA GACTCGGATA GCGACTCAGA CTCAGACAGC GATTCAGACT 3600 

CAGATAGCGA CTCAGACTCG GATAGCGACT CGGATTCAGA TAGCGACTCA GACTCAGATA 3660 

GTGACTCCGA TTCAAGAGTT ACACCACCAA ATAATGAACA GAAAGCACCA TCAAATCCTA 3720 

AAGGTGAAGT AAACCATTCT AATAAGGTAT CAAAACAACA CAAAACTGAT GCTTTACCAG 3780 

AAACAGGAGA TAAGAGCGAA AACACAAATG CAACTTTATT TGGTGCAATG ATGGCATTAT 3840 
TAGGATCATT ACTATTGTTT AGAAAACGCA AGCAAGATCA TAAAGAAAAA GCGTAAATAC 3900 

TTTTTTAGGC CGAATACATT TGTATTCGGT TTTTTTGTTG AAAATGATTT TAAAGTGAAT 3960 

TGATTAAGCG TAAAATGTTG ATAAAGTAGA ATTAGAAAGG GGTCATGACG TATGGCTTAT 4020 

ATTTCATTAA ACTATCATTC ACCAACAATT GGTATGCATC AAAATTTGAC AGTCATTTTA 4080 

CCGGAAGATC AAAGCTTCTT TAATAGCGAT ACAACTGTTA AACCATTAAA AACTTTAATG 4140 

TTGTTACATG GATTATCAAG TGATGAAACG ACATATATGA GATATACAAG CATAGAAAGG 4200 

TATGCGAATG AACACAAATT AGCTGTGATT ATGCCCAATG TGGATCATAG CGCATATGCT 4260 

AACATGGCAT ATGGTCATAG CTATTATGAT TATATTTTGG AAGTGTATGA TTATGTTCAT 4320 

CAAATATTTC CACTTTCCAA AAAGCGTGAT GACAATTTTA TAGCAGGTCA CTCTATGGGA 4380 
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TTATCTGCTG TGTTTGAAGC GCAAAATTTA ATGGATCTAG AGTGGAATGA TTTTTCAAAA 
GAGGCCATAA TTGGCAATCT TTCAAGTGTT AAAGGAACTG AACATGATCC GTATTACTTG 
CTAGACAAAG CTGTAGCTGA AGATAAACAA ATTCCAAAAT TGCTCATTAT GTGTGGTAAA 
CAAGACTTTT TATATCAAGA CAACTTAGAT TTTATCGATT ATTTATCACG CATAAATGTT 
CCTTATCAAT TTGAAGATGG ACCAGGAGAT CATGATTATG CATATTGGGA TCAAGCGATT 
AAGCGTGCTA TAACATGGAT GGTGAATGAT TAATTATTTC TTGGAAAATA TGTGGCTGCA 
TTAAATACAC AGAGTGAGAG ATACAAACTA TTTACGCACG ACTAACATTT CTAAGTGTTT 
AAATTATTTT TGTATTAATA TGATTGGCGC AATTTGCTGA TACACAAAAA TGTTTCTCGT 
GAAACTTAGA TTTAGCTTAT AGTTTTATCA TCATTTGTAT GACTTACATT ATAAATTTTA 
TTATAATGAG GTTAACGCTT TGAAAGGAGT CATCATCATG TCGACCAATA AAAACGATTA 
TGAGCATATG TTGTTTTATT TTGCATATAA AACCTTTATT ACTACCGCTG ATGAAATTAT 
AGAGAAGTAT GGTATGAGTC GTCAGCATCA TCGTTTTTTG TTTTTTATCA ATAAATTACC 
TGGTATTACT ATTAAATCAT TACTAGAAAT ATTAGAAATT TCTAAmCAAG GATCACATGC 
AACACTTCAA AAATTAAAAG AGCAAGGTCT CATTATTGAA AAAGTTTTAG AGACTGATCG 
ACGTGTCAAA AAATTATATT CGACGGATAA AGGCGATCAA CTCATTGCTG AATTGAACAA 
GGCGCAAGAT GAATTATTGC AAAATATATA TCAACAAGTC GGTTCGGATT GGTATGATGT 
GATGGAAGCA TTGGCTAAAG GgCGACCTGG cTTTGATTTT ATTAAGCATT TGAAAGATGA 
AAAAGAAAGC TAGCATCAGA AATGTTAAAA ATCTTCGCAT TCTTAAATTT AAAAAATATG 
TCAAAAAGTG TATAATAAAA ACATATAATT TAATTGAACT CAGTTTCAAC ACATCTTAGA 
AAGGAGTTTG AATGATGAAA AAATTAGCAG TTATTTTAAC ATTAGTTGGC GGTTTATACT 
TCGCATTTAA AAAATACCAA GAACGTGTTA ACCAAGCACC TAACATTGAG TACTAAATTA 
AACCATAAAA AATTCCCGAA CACCTTGTTA TAGTGCTCGG GAATTTTTTT ATGCTTTACT 
TGAATATATC AAATATTATT TTTGCGCTTT CTGTATTTTC GATATTACCA CTAAATGATT 
CTGATCTAGG TCCGTAAGCG TAgGTATTAA CATCCTCGCC TGTATGTCCA TCGGAAGTCC 
ACCCTGTATA AGATTTATCA TTTACTGGCT TCTGAATAGC GTGTTGTAGG GCTTTTGTTT 
GCGTTTCTAC TTCTGCGGAT TTTTCGTCTT TTTCTTTTTT AAGTAGTCTT TTTAGCTTTT 
TATTCTCTTT TTTAACCTTT TTCATATCAT CTTGTGAAAA TTCAAATCCA TAACCTTCAT 
TAATAACTTT TTCAGGGTCT TCACCTTTAG CCATTTTTTC TGTCATATAT GATCCAGAGT 
GTTTCATAGA TTTAATCGGT TGAGGATTCC ATTCGTATCC TTTATCTTTA CCAATTGTTA 
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ATTGAATGGC GTCATCGAAT GCTTTTTCAA AACCTTCCAT TTCAGACATA ACGCCTGTAA 6300 

TATCGTTGGA ATGCGCTGAT TTATCTATAG AAGCACCTTC GACCATTAAA AAGAATCCTT 6360 

TTTTATTGCG CTCAAGCTTA CTAAGTGCAC TTTGTTGCAT ATCAGCTAAT GATGGTTCGT 6420 

CTTTAGAAGC ATCTATTGCA AGTGGCATAT TTTTATCTGC AAACAAACCA AGAACTTTAT 6480 

CTTTATCAGA TTTTGATAAC TCCTTACTGT TCGTGGCAAG GTCGTAACCA TCTTTTTTGA 6540 

ATTTTTTATC TAAATTGCCA TTACTTTTAC CGAAATATTT AGCGCCGCCG CCTAATAAAA 6600 

CATCAACTTT ATGCTTTCCG TTGATTTTAT CTTTATAAAA TTGTTTAGCG ATTTCGTTTT 6660 

TATCATCTCT AGAAGTCACG TGTGCAGCAT ATGCTGCTGG TGTTGCATCT GTTAATTCAG 6720 

CTGTTGAAAC AAGACCAGTC GACTTACCTT TTTCTTTTGC ACGTTCAAGC ACCGTCTTTA 6780 

CTTTCTGCTT GTTACTGTCA ACACCGATGG CACCATTATA TGTCTTATGA CCAGAACTAA 6840 

AGGCTGTTCC GCCAGCTGCA GAATCAGTAA TATTCTGTTT TGGGTCATTT GAATATGTAC 6900 

GATTTGTGCC TTTTAAATAT GAATCAAAAG CAGTAGGGGT CATTTCTTTA GCATGCGGAT 6960 

CATTTTTATA ATAACGATAA GCTGTGTTAA ATGATGGACC CATGCCATCG CCAACTAAAA 7020 

AGATAACATT TTTTGGATTT TTAGTATTAC CAACCGCGAA ACTTTCATCT TTAGAACTTT 7080 

TATCGGATTG CGCAATTGCA GGTGTGACAG AACTAAAAAC CGTTGACACG ATAATAAGGT 7140 

TAGCAACTGC AAATTTTGTG GCTTTTTTAA CTGATAACAT AAGACATCCT CCTGAGTATA 7200 

TGACTATGTC TTCAGTGTAA AAGAGGAATT TtGAGCAATT ATGTAGTTTT AGTTAnAAAT 7260 

ATGTAAACAG AGTGATTTAG AATAACAAAA aATGAATATA TATGACAATT TGTTATAGAA 7320 

AGCGTTAGAA TAGAAGCGTG TGAAAATATA GAATTAAATA TAATTTGAGG TGGAAAAATG 7380 

ATACTAGTAA TGTTATCTCC ATTATTAATC ATATTCTTTA TAGTGTTGTC TATTTTAGAA 7440 

GAGCGTAAAC GTACGAAGAA AAAGCAACTC GAGAAAGAAA AAGCAAATAC ACTAAATCAA 7500 

AATACAAATG ACACGGAAAG TTCAAATCAA GAGCCGTCAT TGCAGCAGGA TAAAGAACAA 7560 

AAAGATAACA AAGGATAATT CAATTGAAGG AAGAAGATTA TAGATGAAAA TATTAATTGT 7620 

TGAAGATGAT TTTGTTATAG CAGAGAGTTT AGCATCTGAA CTTAAAAAAT GGAATTACGG 7680 

TGTTATTGTC GTTGAACAAT TTGATGATAT ACTGTCTATC TTTAACCAAA ATCAACCTCA 7740 

GCTTGTATTG CTAGATATTA ATTTGCCAAC GTTAAATGGT TTTCATTGGT GTCAAGAAAT 7800 

CCGAAAAACA TCTAATGTGC CAATTATATT TATTAGTTCC CGTATTGATA ATATGGACCA 7860 

AATTATGGCA ATACAAATGG GGGGAGATGA TTTTATCGAA AAGCCATTTA ACTTGTCATT 7920 

AACGATTGCC AAAATTCAAG CATTATTGAG ACGAACTTAT GACTTGTCAG TAGCTAATGA 7980 
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ACAAAACATA CAGCTATCTT TGACTGAATT ACAAATATTA AAGTTATTAT TTCAAAATGA 8100 

AGaTAAATAT GTAAGTAGrA CTGCTTTAAT TGaAAAATGT TGGGaATCAG AAAACtTCAT 8160 

AGATGATAAC ACATTAGCTG TTAACATGAC GCGCCTGCTG AAAAAATTAA ATACTATTGG 8220 

CGTTAATGAT TTTATCATTA CAAAGAAAAA TGTCGGATAT AAAGTATAGG GTGAATGCAA 8280 

TGACCTTTCT TAAAAGTATT ACTCAGGAAA TAGCAATAGT CATAGTTATT TTTGCTTTGT 8340 

TTGGCTTAAT GTTTTACCTG TATCATTTGC CATTAGAAGC ATATTTACTA GCACTTGGCG 8400 

TTATTTTATT ATTATTACTC ATATTCATAG GTATTAAATA TTTAAGTTTT GTAAAAACTA 8460 

TAAGCCAACA ACAACAAATT GAAAACTTAG AAAATGCGTT GTATCAGCTT AAAAATGAAC 8520 

AAATTGAATA TAAAAATGAT GTAGAGAGCT ACTTTTTAAC ATGGGTACAT CAAATGAAAA 8580 

CACCCATTAC TGCAGCACAA CTGTTACTTG AAAGAGATGA GCCTAATGTT GTTAATCGTG 8640

TTCGTCAAGA GGTTATTCAA ATTGaTAACT ATACAAGTTT AGCACTTAGT TATTTAAAGT 8700

TATTAAATGA AACTTCTGaT ATTTCTGTCA CTAAAATTTC GATTAATAAT ATCATTCGCC 8760 

CAATTATTAT GAAATATTCA ATACAGTTTA TTGATCAAAA AACAAAAATC CATTATGAAC 8820

CTTGTCATCA CGAAGTATTA ACTGACGTTA GATGGACCTC TTTAATGATA GAACAATTAA 8880 

TAAATAATGC ACTTAAGTAT GCGAGAGGTA AAGATATATG GATTGAATTT GATGAGCAAT 8940 

CCAATCAATT ACACGTAAAA GATAATGGTA TCGGTATTAG TGAAGCGrAC TTGCCTAAAA 9000

TATTTGATAA GGGCTATTCA GGTTATAATG GCCAGCGCCA AAGTAACTCA AGTGGGaTTG 9060 

GTTTATTTAT CGTAAAACAA ATTTCAACAC ACACAAACCA TCCTGTTTCA GTCGTATCTA 9120 

AACAAAATGA GGGTACAACA TTTACGATTC AATTTCCAGA TGAATAAAAA CTTTCAATAT 9180 

TGTAAGTATA CTAGTAACAT TTTTTTACTA ATTTAAATGT TATTAGTATT TTTTTGTTTT 9240

AAT^TAGAAC TAACAAAGAA ATGAGGTGCA TGCCATGTTG CTAGAAGTGn AACATGTAAA 9300

AAAGGTTTAT GGTAAAGGTT TGAATGCTAC GACAGCACTT AATCAAATGA ATTTATCAGT 9360

TGGAGCTGGT GaATTTGTTG CaATTATGGG TGAGTCTGGG tCAGGGAAGT CTACACTACT 9420

AAATTTAATT GCtTCTTTTG ATGGACTAAC TGAAGGTGAC ATTATTGTGG ATGGCGCACA 9480

TTTAAATAAT ATGAAAAATA AAAGTAAAGC ATTGTATCGT CaACAAATGG TAGGTTTTGT 9540

TTTTcAAGAT TTTAATCTTT TACCAACAAT GACGAATAAA GAAAATATAA TGATGCCATT 9600 

AATTTTAGCT GGTGCTAAAC GAAAAGATAT AGAACAAAGG GTACATCAGT TGGCAGTACA 9660 

ATTACATTTA GAGGGATTCT TAAACAAGTA TCCTTCTGAA ATCTCTGGGG GTCAGAAGCA 9720 

ACGCATTGCC ATTGCACGTG CATTAGTTAC TAAGCCGACG ATTTTACTAG CCGATGAACC 9780 
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TCAATTGGAA CAGACAATTT TAATGGTAAC TCATTCAAAT ATCGATGCGT CTTATGCAGA 9900 

GCGAGTCATT TTTATTAAAG ATGGGCGTCT ATATCATGAA ATATATCGTG GTGAAGAAAG 9960 

TCAATTAGCT TTTCAACAAC GAATAACAGA TAGCTTAGCA CTTGTGAATG GAGGAAGTGT 10020

CAATATATGA AGTTAAGATT GTTATGnACA TAGTGCGACG TCAATTTATT ACGCAGCGAC 10080 

TTGTAATCAT TCCATTCATT TTAGCGGTAA GTGTACTATT CATGATTGAA TATACGCTTG 10140 

TGTCAATTGG GTTAAATAGC TACATAAAAC AGAAGAATGA CTTCCTAGTA CCATTTATTA 10200 

TCATAGCTAA TTTTTTTATG GCGCTTTTAA CTTTTATTTT TATTTTCTAT GCAAATCACT 10260 

TTATGATGTC ACAAAGACGA AAAGAGTTTA GCATTTTTAT GACATTGGGC ATGACCAAGA 10320 

AAAGTATGCG TTTAATTGTA GTGATGGAAA CTATCTTACA ATTTGTGATA ATTTCAGTCG 10380 

TTAGTATTGC CGGCGGATAC TTACTTGGTG CGATATTTTT CTTGTTTATA CAGAAAATAA 10440 

TGGGCAGTGA AGTTGCGACG TTAAGGTATT ATCCATTTGA CTCTGTAGCG ATGTTTATTA 10500

CTTTGATTAT CATTGCTGTA TTAATGGGCA TGCTACTTAT ATTCAACTTG TTTAGTATTA 10560 

ATTTTCAACG GCCGATAACT TATCAACATC GTTCCGATTC TAGTGTCATA TCACGATGGT 10620 

TGCGTTACGT TTTAATTGTT ATAGGAAGCG CAnACTATAT TTAGGTTACT TTATTGCATT 10680

ACAACAAGAT ACGACGTTTG GTGCCTTTTT TAAAATATGG ATTGTCATAG GATTAGTTAT 10740 

TATCGGTACT TATGCATTTT TTGTAGGTAT AAGTGAAATA ATTATTAGTA TATTGCAGCA 10800 

GGTATCAAAA GTTTACTATC ATCCACGGTA TTTTTTTGTG GTAGTTGGGA TGCGTGTACG 10860 

TCTTAAAATG AATGCAGTCA GTCTTGCAAC AATCACTTTG CTGTGTACAT TTTTGATTGT 10920 

AACGCTCACA ATGACATTAA CAACCTATCG TGATATGAAT CATACCATTA CGAAATTGAT 10980 

TACGAATGAT TakGATTTGT CATTTAGCGA CAATTCTAAG TCACAAaTAG AACGTCAACA 11040 

AACAATTGAG 11050 

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 983 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

CGACATAACG AGGCAAGGGT ACATGATACT TTAGCCTCGT TTTTGATATG TATTTTTCTG 60 
AATATAAGGG CAATAGATGG TATTTTATAw TTTTTTTAAG GTAGTGATTA ACATAGATAT 120 
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TCAAGCGGAA CAGCATTATG CACCAGTATT AACGCATTTT TTAGATCCAA GAGGGCAATA 24 0 

TATATTGGAA GTGATTTGTG GCAGTTATGA AGATTTAAAC GTATCTTTTT ATGGTGGACC 300

TAATGCTGAA AGAAAAAGAG CAATCATTTC GCCGAACTAT TATGAACCTA AAGAAAGCGA 360 

CTTTGAATTA ACTTTAATGG AAATAGATTA TCCTGAAAAA TTCGTCACTT TAAAACATCA 420 

ACATATTTTA GGGACATTAA TGTCTTTAGG TATCGAACGC GAACAAGTTG GAGATATAAT 480 

TGTGaATGAA CGAATTCAAT TTGTTTTGAC AAGTAGATTG GAATCATTTA TTATGTTAGA 540 

ATTACAACGT ATTAAAGGCG CATCAGTTAA ACTTTATACT ATTCCAGTAA CAGATATGAT 600 

ACAATCTAAT GAGAATTGGA AAAATGAAAG TGCaCAGTTA GTTCTTTAAG GTTAGATGTT 660 

GTTATTAAAG AAATGATACG TAAATCACGT ACGATTGCGA AACAACTAAT CGAAAAAAAA 720 

CGTGTTAAAG TGAATCACAC TATTGTTGAT TCAGCAGATT TTCAATTACA AGCAAATGAT 780 

TTAATATCCA TCCAAGGTTT TGGTAGAGCA CACATTACTG ACTTAGGTGG TAAAACTAAA 840

AAAGATAAAA CGCACATTAC CTATAGAACA TTATTCAAAT AGTAATGATT TAAGGAGGAT 900 

AACAAATGCC TTTTACACCA AATGAaATTA AGAATAAAGA GTTTTCACGT GTaAAGAATG 960 
GTTTTAGAAC CTACTGnAGT TGG 983

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10322 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
TTTTGCAAAG CTTATTTTAT GTCAAACAGA TAGTCAATGT GAAACAAAGG TTAGTACATA 60 
TAATCATCCA GACTTTATGT ATATATCAAC AACTGAGAAT GCAATTAAGA AAGAACAAGT 120
TGAACAACTT GTGCGTCATA TGAATCAACT TCCTATAGAA AGCACAAATA AAGTGTACAT 180 
CATTGAAGAC TTTGAAAAGT TAACTGTTCA AGGGGAAAAC AGTATCTTGA AATTTCTTGA 240 
AGAACCACCG GACAATACGA TTGCTATTTT ATTGTCTACA AAACCTGAGC AAATTTTAGA 300

CACAATCCAT TCAAGGTGTC AGCATGTATA TTTCAAGCCT ATTGATAAAG AAAAGTTTAT 360
AAATAGATTA GTTGAACAAA ACATGTCTAA GCCAGTAGCT GAAATGATTA GTACTTATAC 420
TACGCAAATA GATAATGCAA TGGCTTTAAA TGAAGAATTT GATTTATTAG CATTAAGGAA 480
ATCAGTTATA CGTTGGTGTG AATTGTTGCT TACTAATAAG CCAATGGCAC TTATAGGTAT 540



GAATGGTTTC TTCGAAGATA TCATACATAC AAAGGTAAAT GTAGAGGATA AACAAATATA 660

TAGTGATTTA AAAAATGATA TTGATCAATA TGCGCAAAAG TTGTCGTTTA ATCAATTAAT 720 

TTTGATGTTT GATCAACTGA CGGAAGCACA TAAGAAATTG AmTCAAAATG TAAATCCAAC 780 

GCTTGTATTT GAACAAATCG TAATTAAGGG TGTGAGTTAG ATGCCAAATG TAATAGGTGT 840 

TCAGTTTCAA AAAGCGGGAA AATTAGAATA TTATACACCT AATGATATAC AAGTAGATAT 900 

AGAAGACTGG GTAGTTGTCG AATCTAAAAG AGGCATAGAG ATAGGTATTG TTAAAAATCC 960 

ATTAATGGAT ATTGCTGAAG AGGATGTTGT GTTACCTCTT AAAAATATTA TTCGCATTGC 1020 

TGATGACAAA GATATTGATA AATTTAATTG TAATGAACGA GATGCTGAAA ATGCATTAAT 1080

ACTATGTAAA GACATTGTAA GAGAACAAGG TTTGGACATG CGTTTAGTCA ATTGCGAATA 1140 

TACATTAGAT AAATCGAAAG TTATTTTTAA TTTTACGGCG GATGATCGTA TTGATTTTAG 1200 

AAAATTAGTA AAAATATTAG CGCAACATTT AAAAACACGT ATCGAGTTGA GACAAATTGG 1260

TGTAAGGGAT GAAGCCAAAT TGCTTGGCGG TATCGGACCT TGTGGTAGGT CGTTATGTTG 1320 

TTCTACATTT TTAGGGGATT TTGAACCAGT ATCGATTAAG ATGGCTAAGG ATCAAAATTT 1380 

ATCATTAAAT CCAACTAAAA TTTCTGGTGC ATGTGGTCGT TTGATGTGTT GTTTAAAATA 1440 

TGAAAATGAC TATTATGAGG AAGTACGTGC ACAATTACCT GATATTGGTG AAGCAATTGA 1500 

AACGCCTGAT GGTAACGGGA AAGTAGTTGC TTTAAATATA TTAGACATTT CTATGCAGGT 1560 

GAAGCTTGAG GGACATGAAC AGCCACTTGA ATATAAATTA GAAGAAATAG AAACTATGCA 1620 

TTAAGGAGGC ATTATTACAT TTGGATCGCA ATGAAATATT TGAAAAAATA ATGCGTTTAG 1680 

AAATGAATGT CAATCAACTT TCAAAGGAAA CTTCAGAATT AAAGGCACTT GCAGTTGAAT 1740

TAGTAGAAGA AAATGTAGCG CTTCAACTTG AAAATGATAA TTTGAAAAAG GTGTTGGGCA 1800 

ATGATGAACC AACTACTATT GATACTGCGA ATTCAAAACC AGCAAAAGCT GTGAAAAAGC 1860

CATTACCAAG TAAAGATAAT TTGGCTATAT TGTATGGAGA AGGATTTCAT ATTTGTAAAG 1920

GCGAATTATT TGGAAAACAT CGACATGGTG AAGATTGTCT GTTCTGTTTA GAAGTTTTAA 1980 

GTGATTAATC AAGCACACTC AAATAGTGTT ATAATTATAA ATGAATATGG TTTGGATAAG 2040

TCTGAGACAA TGCATGTTTC AGGCTTTAAT TGTGTATAAA GTTTTGGTGA TTGCATAAGA 2100 

GATGGCGGTA CTAAATGTTA TTATTAAGTG TGCACGCAgT ATCaTTAGTT ATAAAATGTA 2160 

GCTGTTAAAA GTCAAAAATA CATCGAATGT AGTTAGGCAT ATAATATAAA AAGAGTTTTC 2220 

AATTACTCAA TAGAAAAAGG TTGTCTTCAT AGGAGTTAAA AATGTTAAAA GAGAATGAAC 2280 

GATTTGATCA ACTAATCAAA GAAGATTTTA GTATTATTCA AAATGATGAT GTTTTTTCAT 2340
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TGGACTTATG TTCAGGCAAT GGGGTGATAC CCTTGTTATT GTTTGCGAAA CATCCACGAC 24 60 

ATATAGAAGG TGTTGAGATT CAAAAAACAC TTGTCGATAT GGCGCGACGC ACATTTCAAT 2520 

TCAATGATGT TGATGAATAT TTAACAATGC ATCACATGGA TTTGAAAAAC GTTACTAAAG 2580 

TATTTAAACC TTCACAATAT ACTTTAGTAA CGTGTAATCC GCCTTATTTT AAAGAGAATC 2640 

AGCAACACCA ACATCAAAAA GAAGCACATA AGATAGCGAG ACATGAGATT ATGTGTACAC 2700 

TTGAAGATTG CATGATTGCA GCCCGTCATT TATTAAAAGA AGGTGGCAGG CTAAACATGG 2760 

TACATCGTGC AGAGAGACTA ATGGATGTCT TGTTTGAAAT GAGAAAAGTG AATATTGAAC 2820 

CTAAGAAAGT CGTTTTTATA TATAGTAAAG TAGGGAAATC AGCACAAACG ATAGTAGTAG 2880

AAGGTCGAAA AGGTGGAAAT CAAGGTTTAG AAATCATGCC CCCATTTTAT ATTTATAATG 2940 

AAGATGGTAA TTATAGCGAA GAAATGAAGG AAGTATATTA TGGATAGTCA TTTTGTATAT 3000 

ATTGTAAAAT GTAGTGATGG AAGTTTATAT ACAGGATACG CTAAAGACGT TAATGCACGT 3060

GTTGAAAAAC ATAACCGAGG TCAAGGAGCC AAATATACGA AAGTAAGACG TCCGGTGCAT 3120 

TTAGTTTATC AAGAAATGTA TGAGACAAAG TCTGAAGCAT TGAAGCGTGA ATATGAAATT 3180 

AAAACTTATA CCAGACAAAA GAAATTGCGA TTAATTAAGG AGCGATAGTA TGGCTGTATT 3240 

ATATTTAGTG GGCACACCAA TTGGTAATTT AGCAGATATT ACTTATAGAG CAGTTGATGT 3300 

ATTGAAACGT GTTGATATGA TTGCTTGTGA AGACACTAGA GTAACTAGTA AACTGTGTAA 3360 

TCATTATGAT ATTCCAACTC CATTAAAGTC ATATCACGAA CATAACAAGG ATAAGCAGAC 3420 

TGCTTTTATC ATTGAACAGT TAGAATTAGG TCTTGACGTT GCGCTCGTAT CTGATGCTGG 3480

ATTGCCCTTA ATTAGTGATC CTGGATACGA ATTAGTAGTG GCAGCCaGAG AAGCTAATAT 3540

TAAAGTAGAG ACTGTGCCTG GACCTAATGC TGGGCTGACG GCTTTGATGG CTAGTGGATT 3600 

ACCTTCATAT GTATATACAT TTTTAGGATT TTTGCCACGA AAAGAGAAAG AAAAAAGTGC 3660 

TGTATTAGAG CAACGTATGC ATGAAAATAG CACATTAATT ATATACGAAT CACCGCATCG 3720

TGTGACAGAT ACATTAAAAA CAATTGCAAA GATAGATGCA ACACGACAAG TATCACTAGG 3780 

GCGTGAATTA ACTAAGAAGT TCGAACAAAT TGTAACTGAT GATGTAACAC AATTACAAGC 3840

ATTGATTCAG CAAGGCGATG TACCATTGAA AGGCGAATTC GTTATCTTAA TTGAAGGTGC 3900 

TAAAGCGAAC AATGAGATAT CGTGGTTTGA TGATTTATCT ATCAATGAGC ATGTTGATCA 3960 

TTATATTCAA ACTTCACAGA TGAAACCAAA ACAAGCTATT AAAAAAGTTG CTGAAGAACG 4020 

ACAACTTAAA ACGAATGAAG TATATAATAT TTATCATCAA ATAAGTTAAT CACTTTATCG 4080

ATTaTATGAA ATTTTAAACG ATTTTATAAA CGCAAGCTGT AATTTTAAAT GGTAAGTTAT 4140 
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GTTTTTTAAT GTAAAATAAA TACATTGAAA GTAATAAATA CCTTAACATT GAATAAGATG 4260 

AAAATGAGAT GACGAGATAA ATGTTCGCGT CCGTTGAAAT GCATAGAAAT CTTAGATATT 4320 

ATTTGAAGTG AGACATTACG AGGAGGAACA GTTATGGCTA AAGAAACATT TTATATAACA 4380 

ACCCCAATAT ACTATCCTAG TGGGAATTTA CATATAGGAC ATGCATATTC TACAGTGGCT 4440 

GGAGATGTTA TTGCAAGATA TAAGAGAATG CAAGGATATG ATGTTCGCTA TTTGACTGGA 4500

ACGGATGAAC ACGGTCAAAA AATTCAAGAA AAAGCTCAAA AAGCTGGTAA GACAGAAATT 4560 

GAATATTTGG ATGAGATGAT TGCTGGAATT AAACAATTGT GGGCTAAGCT TGAAATTTCA 4620

AATGATGATT TTATCAGAAC AACTGAAGAA CGTCATAAAC ATGTCGTTGA GCAAGTGTTT 4680

GAACGTTTAT TAAAGCAAGG TGATATCTAT TTAGGTGAAT ATGAAGGTTG GTATTCTGTT 4740

CCGGATGAAA CATACTATAC AGAGTCACAA TTAGTAGACC CACAATACGA AAACGGTAAA 4800 

ATTATTGGTG GCAAAAGTCC AGATTCTGGA CACGAAGTTG AACTAGTTAA AGAAGAAAGT 4860

TATTTCTTTA ATATTAGTAA ATATACAGAC CGTTTATTAG AGTTCTATGA CCAAAATCCA 4920 

GATTTTATAC AACCACCATC AAGAAAAAAT GAAATGATTA ACAACTTCAT TAAACCAGGA 4980 

CTTGCTGATT TAGCTGTTTC TCGTACATCA TTTAACTGGG GTGTCCATGT TCCGTCTAAT 5040 

CCAAAACATG TTGTTTATGT TTGGATTGAT GCGTTAGTTA ACTATATTTC AGCATTAGGC 5100 

TATTTATCAG ATGATGAGTC ACTATTTAAC AAATACTGGC CAGCAGATAT TCATTTAATG 5160 

GCTAAGGAAA TTGTGCGATT CCACTCAATT ATTTGGCCTA TTTTATTGAT GGCATTAGAC 5220 

TTACCGTTAC CTAAAAAAGT CTTTGCACAT GGTTGGATTT TGATGAAAGA TGGAAAAATG 5280 

AGTAAATCTA AAGGTAATGT CGTAGACCCT AATATTTTAA TTGATCGCTA TGGTTTAGAT 5340

GCTACACGTT ATTATCTAAT GCGTGAATTA CCATTTGGTT CAGATGGCGT ATTTACACCT 5400 

GAAGCATTTG TTGAGCGTAC AAATTTCGAT CTAGCAAATG ACTTAGGTAA CTTAGTAAAC 5460 

CGTACGATTT CTATGGTTAA TAAGTACTTT GATGGCGAAT TACCAGCGTA TCAAGGTCCA 5520 

CTTCATGAAT TAGATGAAGA AATGGAAGCT ATGGCTTTAG AAACAGTGAA AAGCTACACT 5580 

GAAAGCATGG AAAGTTTGCA ATTTTCTGTG GCATTATCTA CGGTATGGAA GTTTATTAGT 5640 

AGAACGAATA AGTATATTGA CGAAACAACG CCTTGGGTAT TAGCTAAGGA CGATAGCCAA 5700 

AAAGATATGT TAGGCAATGT AATGGCTCAC TTAGTTGAAA ATATTCGTTA TGCAGCTGTA 5760 

TTATTACGTC CATTCTTAAC ACATGCGCCG AAAGAGATTT TTGAACAATT GAACATTAAC 5820 

AATCCTCAAT TTATGGAATT TAGTAGTTTA GAGCAATATG GTGTGCTTAA TGAGTCAATT 5880 

ATGGTTACTG GGCAACCTAA ACCTATTTTC CCAAGATTGG ATAGCGAcGG AnAATTGCAT 5940 
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AACCTCAAAT TGATATTAAA GACTTTGATA AAGTTGAAAT TAAGGCAGCA ACGATTATTG 6060 

ATGCTGAACA TGTTAAGAAG TCAGATAAGC TTTTAAAAAT TCAAGTAGAC TTAGATTCTG 6120 

AACAAAGACA AATTGTATCA GGAATTGCCA AATTCTATAC ACCAGATGAT ATTATTGGTA 6180 

AAAAAGTAGC AGTTGTTACT AACCTGAAAC CAGCTAAATT AATGGGACAA AAATCTGAAG 6240 

GTATGATATT ATCTGCTGAA AAAGATGGTG TATTAACCTT AGTAAGTTTA CCAAGTGCAA 6300

TTCCAAATGG TGCAGTGATT AAATAACTGT ATTTTTAAAA ATTAGGAGAG ATAATTATGT 6360 

TAATCGATAC ACATGTCCAT TTAAATGATG AGCAATACGA TGATGATTTG AGTGAAGTGA 6420 

TTACACGTGc TAGAGAAGCA GGTGTTGATC GTATGTTTGT AGTTGGTTTT AACAAATCGA 6480 

CAATTGAACG CGCGATGAAA TTAATCGATG AGTATGATTT TTTATATGGC ATTATCGGTT 6540 

GGCATCCAGT TGACGCAATT GATTTTACAG AAGAACACTT GGAATGGATT GAATCTTTAG 6600 

CTCAGCATCC AAAAGTGATT GGTATTGGTG AAATGGGATT AGATTATCAC TGGGATAAAT 6660 

CTCCTGCAGA TGTTCAAAAG GAAGTTTTTA GAAAGCAAAT TGCTTTAGCT AAGCGTTTGA 6720 

AGTTACCAAT TATCATTCAT AACCGTGAAG CAACTCAAGA CTGTATCGAT ATCTTATTGG 6780 

AGGAGCATGC TGAAGAGGTA GGCGGGATTA TGCATAGCTT TAGTGGTTCT CCAGAAATTG 6840 

CAGATATTGT AACTAATAAG CTGAATTTTT ATATTTCATT AGGTGGACCT GTGACATTTA 6900 

AAAATGCTAA ACAGCCTAAA GAAGTTGCTA AGCATGTGTC AATGGAGCGT TTGCTAGTTG 6960 

AAACCGATGC ACCGTATCTT TCGCCACATC CGTATAGAGG GAAGCGAAAT GAACCGGCGA 7020 

GAGTAACTTT AGTAGCTGAA CAAATTGCTG AATTAAAAGG CTTATCTTAT GAAGAAGTGT 7080 

GCGAACAAAC AACTAAAAAT GCAGAGAAAT TGTTTAATTT AAATTCATAA AGTTAAAAGT 7140

GAGAAAGATC ACCGCCATAA ATGTAAACGA TGCTATATTC GTTTAATATG CTATGGTTCT 7200 

TTCTCACTTT TTTAAATTAA AATATCGTGC ATGTGGAATA CGTGCGATAG AGATGGTTAG 7260 

AGCTTTGAAA TTAAGAATTG TAGGAAGGCG TTTTAAATGA AAATCAATGA GTTTATAGTT 7320 

GTAGAAGGAC GAGATGATAC TGAGCGTGTT AAACGAGCTG TTGAATGTGA TACGATTGAA 7380 

ACGAATGGTA GTGCCATCAA CGAACAAACT TTAGAAGTAA TTAGAAATGC TCAACAAAGT 7440 

CGAGGCGTTA TTGTATTAAC AGATCCAGAT TTCCCAGGAG ATAAAATTAG AAGTACAATT 7500 

ACTGAACATG TCAAAGGTGT TAAACATGCG TATATTGATA GAGAAAAAGC TAAAAATAAA 7560 

AAAGGGAAAA TTGGTGTTGA ACATGCCGAC TTAATTGATA TTAAAGAAGC GTTAATGCAT 7620 

GTTAGTTCAC CCTTTGATGA AGCTTATGAA TCAATTGATA AATCTGTGCT AATAGAGTTG 7680 

GGGTTAATTG TTGGGAAAGA TGCAAGGCGC CGTAGAGAAA TTTTAAGTAG AAAATTGCGA 7740 
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GCGGATGTAA GGCAAGCTTT AGAAGATGAA TGAGGAAGTG AAAATGTTGG ATAATAAAGA 7860 

TATTGCAACA CCATCAAGAA CGCGAGCGTT GTTAGATAAA TATGGCTTTA ATTTTAAAAA 7920 

AAGTTTAGGA CAGAACTTTT TGATAGATGT GAATATCATT AATAATATCA TTGATGCAAG 7980 
TGATATTGAT GCACAAACTG GGGTGATTGA AATTGGTCCA GGCATGGGGT CATTGACAGA 8040

ACAATTGGCC AGACATGCTA AAAGAGTATT GGCATTTGAA ATTGATCAAC GTTTAATACC 8100 

TGTATTAAAT GATACACTAT CACCTTATGA TAATGTGACG GTGATTAATG AAGATATTTT 8160 

AAAAGCGAAT ATTAAAGAAG CTGTTGAAAA TCATTTACAA GATTGTGAAA AAATAATGGT 8220 

TGTTGCAAAC CTGCCGTACT ATATTACGAC GCCAATTTTA TTAAATTTGA TGCAACAAGA 8280

TATACCAATT GATGGCTACG TGGTGATGAT GCAAAAAGAA GTGGGCGAAC GCTTAAATGC 8340

TGAAGTAGGT TCAAAAGCAT ATGGTTCGTT ATCAATTGTC GTACAATACT ATACAGAGAC 8400 

TAGTAAAGTA TTAACGGTAC CTAAATCTGT ATTTATGCCA CCACCTAATG TTGATTCAAT 8460

AGTTGTAAAA CTGATGCAGA GAACTGAACC GTTAGTAACA GTAGATAACG AGGAAGCATT 8520 

CTTTAAGTTA GCAAAAGCAG CATTTGCACA AAGAAGAAAG ACAATTAACA ATAACTATCA 8580

AAATTATTTT AAAGATGGTA AACAACACAA AGAAGTGATT TTACAATGGT TGGAACAAGC 8640

AGGTATTGAT CCAAGACGTC GCGGTGAAAC GCTATCTATT CAAGATTTTG CTAAATTGTA 8700 

TGAAGAAAAG AAAAAATTCC CTCAATTAGA AAATTAAATG ATTGACAAAG CAAAGCACTA 8760 

TTGTTAAAAT TTAAATTTTG TTTGACGAAA ACGTTGCAAA TATGGTATTA TGTAACTTGT 8820 

AGCGAGGTGG AGCAATATGC CAAAATCAAT TTTGGACATC AAAAATTCTA TTGATTGTCA 8880 

TGTAGGAAAT CGTATTGTAC TGAAaGCCAA TGGAGGCCGT AAGAaAACAA TAAAACGTTC 8940

TGGAATTTTA AAAGAAACAT ATCCGTCAGT TTTCATTGTT GAGTTAGATC AAGACAAACA 9000 

CAACjTTGAG AGAGTATCTT ATACATACAC TGATGTGTTA ACTGaAAATG TTCAAGTTTC 9060 

ATTTGAAGAG GATAATCATC ACGAATCAAT TGCACACTAA ATAAGACATA TAGAGATGTT 9120

AGACGTTTCT TAGTATAAGA AGTAAATATT ATGATAATTA TTTGAGTGTT GGGcATTATG 9180 

TTCAATACTC TTTTTATTTA CAAAATGTTT AACACTGATG TTTCGCTTAT AGATTTTTCA 9240 

GTAAATGGAT AATTGTA1TT ATAAACACAA ATACAAGTAA ATACTAAGTA ATTAGATGGA 9300 

GAAAATTACT TTTTTATTAA AAAAACACTA AAAAACAAAT TAAAATGTCA AATATTAATT 9360

CTCTTTATGT TAAAATCATC ATATTAAGAT AACGAAAAGA GGGCGGAAAA TGATATATGA 9420 

AACGGCACCA GCCAAAATTA ATTTTACGCT CGATACACTT TTTAAAAGAA ATGATGGCTA 9480

TCATGAGATT GAAATGATAA TGACAACAGT TGATTTAAAT GATCGTTTAA CTTTTCATAA 9540
AAATCTCGCA TATCGTGCAG CGCAACTATT TATTGAGCAA TATCAACTAA AGCAAGGTGT 9660
AACAATTTCT ATCGATAAAG AAATACCTGT TTCTGCTGGC TTAGCTGGAG GTTCGGCTGA 9720
TGCAGCAGCA ACGTTAAGAG GATTGAATCG ACTTTTTGAT ATAGGGGCGA GTTTGGAAGA 9780
ATTGGCTCTA CTAGGCAGTA AAATCGGGAC AGATATTCCG TTTTGTATTT ATAATAAAAC 9840
TGCACTATGT ACTGGAAGAG GAGAGAAAAT CGAGTTTTTA AATAAACCAC CTTCAGCTTG 9900
GGTGATTCTT GCTAAACCAA ACTTAGGCAT ATCATCACCA GATATATTTA AGTTGATTAA 9960
TTTAGATAAG CGTTACGACG TACATACGAA AATGTGTTAT GAGGCCTTAG AAAATCGAGA 10020
TTATCAACAA TTATGTCAAA GTTTGTCTAA TCGATTAGAG CCAATTTCTG TTTCAAAACA 10080
CCCACAAATC GATAAATTAA AAAATAATAT GTTGAAAAGT GGTGCAGATG GTGCGTTAAT 10140
GAGTGGAAGC GGACCTACTG TGTATGGGCT AGCACGAAAA GAAAGCCAAG CAAAAAATAT 10200
TTATAATGCA GTTAACGGTT GTTGTAATGA AGTGTACTTA GTTAGACTAT TAGGATAGAA 10260
GGGTTGAAAA GATGAGATAT AAACGAAGCG AGAGAATTGT TTTTATGACG CAATATTTGA 10320
TG 10322

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5614 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

GATTGATTAA ATGTTTTAAT CCACTTCAAT GCCTTCGATA AACTCTACAA TCGCGCTATT 60 

CATATAATTA TTCGATTTCA TTTGTTCAGC ATATGTCTCA TTAAATCCAG ACATAACTTT 120 

TTTAAAwGCG AAAATTGAAA TTGGTATCGT TACTAATAAG GCACTAGCCA TACGCCAATC 180 

AATGAGCATT ATGTATAAAA AGATAGCAGC TGACAAAAGT AAGTTTCCTA TAACTTCAGG 240

AATCATATGT GCTAAAGGTA ATTCTATTGT TTCAACCTTA TCGACAAATA TATTTTTTAA 300 

TTCACCTATT TTCTTAGATT CCaCTACGCC TAAAGGGAGA CGCATTAATT TTTGAGCTAA 360 

TTTTTTACGA ATTTCAGATA AAATTTCATA TGCCGTAATA TGTGATAGCA TCGTTGACGC 420 

TCCAAAACAA CACACTTGTG AAATATAAGC GATTAAAGCA ATAAAGATAT AAACCATAAT 480

CGAATTAATC GTATATGTAT TGTTAATCAT CATTAAAATA ATTTTAAATA CTGCCCAATA 540 

AGGAACTAAT CCAGAAAAGA CACTGATGAT AGACAACAAA ATTGATAACA TAATTTTCCA 600

ATATGTAACT CCTkTCAATT AATAATCTAA ATTAAGCCGC TTATATTATT TATTTCACTG 720

GATGATATAC ATAATATAAA x x J. o x inx x x \J 1 IHMnHnl 1 AA1AL1 1A1 i ACAAGTACAT 780

CATATATTAG TTGATAACGA TTATCAATGT CGCGTGGATT TGTGACACAT TTCTTTTAAA 840

AATTCACAAG GTTATGGGGC AGAAATGATA AAGAGCCACT AATGATTTAT TATGTAGTGG 900

TTC*TGGf5Af!T GGGACAGAAA TGATATTTTC ACAAAATTTA TTTCGTCGTC CCACCCCAAC 960

TCTAGAAATT GGGAATCCAA TTTCTCTTTG TTGGGTCCCT GAATATAGCC 1020

TTGTAGAGTC TAGTACATTG ATTTGTATCC CAATGTCCCT ATAATTGATT ATTCGCTTTA 1080

TCTAATGATC CTATGACTCA ACTATTAAAT CATTTTTCGA AATACTTAAT TCTAATATAA 1140

TTAAATTCAT TTATTGTAAT ATTGCAAAAA TACATTGCAC ACCTTGTTCA TCAATGCTAT 1200

AATTAATTAC ATAATAAATT GAACATCTAA ATACACCAAA TCCCCTCACT ACTGCCATAG 1260

TGAGGGGATT TATTTAGGTG TTGGTTATTT GTCACCTTTT TTATTGTTGC GCGTTCGTAA 1320

CCAATGTGCA AAAAACGCAA CAAGAGAGCC GCTTATAGCT GAAGTCATGA TGTTAATTAA 1380

TAAATTGAAC ATCCGTCATA CACCTCCTCT CTGCGTTAAA GTAACGCCCG AGATGTTAGG 1440

CGACCATCAT ATTATATCAT TTATTTATTA TATTTCACGC AATATTAAGG CTTAAGTAAA 1500

GTGGTTTACG CTACTTTAAT TGCTATCTTT TAAAATCCAT TTAGATAATA 1560

TAAATGTGAT GGGTATCGTA A 1 AA 1 I AAAC CAGCAAATGG TGCAATTTCT GCTGGCAAAT 1620

TTAGCCAGGA TACAAATACA TATAATAAAA CTGTTTGTAA GCTTACGTTG ACAATCTGCG 1680

TAATTGGAAA ACTAATGAAT TTTCTCCAAG TAGGTTTTAC CCTGTAAACA AAATAACAAT 1740

TCAAATAATA TGAAATCACA AAAGCGACTA GAAATCCGGT AATATGACTA ATCATATATT 1800

CAATGTGTAA TAATTTTAAC AGCAATAAAT AGACAACATA ATAATTTAAC GTATTAATGC 1860

CGCCAACAAT GATAAATTTT AAAATTTCAG CATGCGTTTG TGTTAGTTTC ATATGTGTAC 1920

TCCTCAACAT CAAAATATAT GCATAACTAC GTTCTCGAAC ATACTCGAAT ATGCGAGCCA 1980

ATCCGCTTCA CTTCAAATAT GCTTATTTCA ATCTTTATAC CCTTTCACAG CAAATTTAGT 2040

CTCTTTCCCC TCATCCTTAT ACGCCATTAT AATGTAACTG ATTTATCGCG TGACTCATTA 2100

GCACTATAGA GATTACTTTA GTTCACTAGT AATTTTATAT ACAATAAGAG CGACAACAGT 2160

AATGAGAGGA TGTCTACTAT GCAATTACAA AAAATTGTCA TCGCTCCTGA CTCATTTAAG 2220

GAAAGTATGA CCGCACAGCA AGTTGGCAAT ATTATAAAAC AGGCTTTTAC TAATGTTTAT 2280

GGGAATACCC TTCATTATGA TATCATTCCG ATGGCTGATG GTGGTGAAGG TACCACAGAT 2340

GCTTTAATGC ATGCAACAGG TGCCACTAAG TATACAGTCA TCGTTAATGA CCCTTTAATG 2400
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GCGGCAGCGT CAGGTTTGGA TTTATTAGAA AAAGAGGAAC GTAATCCTTT ATACACATCA 2520 

TCATATGGTA CCGGTGAACT AATTAAAGAT GCATTAAATC ATGGTGCTAA GACCATTATT 2580 

TTAGGGATTG GTGGCAGTGC AACAAATGAT GGTGGTACAG GTATGCTAAG TGCACTAGGC 2640

GTAAAGTTTA CTGATGTAAA CGGGGACTTA TTACAAATGA ATGGTGCTAA TCTTGCTCAC 2700 

ATTGCACAAA TCGATATAAC CAATCTAGAT TCGCGATTAA AAGAGGTGAC CTTTAAAGTG 2760 

GCCTGTGATG TTTCAAATCC TTTATTGGGT GAAAATGGTG CTACCTATAT TTATGGTCCT 2820 

CAAAAAGGCG CTGATGCAAA GATGATACCA AAGTTGGATT TCGCAATGTC GCATTATCAT 2880 

GATAAGATAA AAATGTGCAC AGGAAAGTCC GTTAATCAAA TACCAGGTTC TGGTGCAGCT 2940

GGCGGTATGG GCGCAGCATT ATTAGCGTTT TGTGAGACAA CTTTAACAAA AGGTATTGAT 3000 

GTCGTCTTTG ACATTACAGA TTTTCATCAA AGAATTAAAG ATGCAGACCT CGTTATTACT 3060 

GGAGAAGGAC GCATGGATTA TCAGACCATC TTTGGTAAAA CACCCGTAGG CGTTGCGTTA 3120 

GCTGCAAAAC AATATCATAT TCCTGTCATC GCGATTTGTG GCAGTCTAGG CGAAAATTAT 3180 

CAACATGTTT ACGATTTCGG TATTGATAGT GCCTATTCTA TAATCTCTTC ACCTAGCACT 3240

TTAGAAGATG TCCTACAAAA TAGCGAACAA AATTTATTAA ACACTGCAAC TGACATTGCT 3300

CGTATTCTGA AATTACAATA ATGTCAAAGT AAATCATCAG CTTTATTATT TGCAGTTAAA 3360 

ACTTGAATGA GGTGAAACCC ATGAAAAGAA CTGATAAATA CCGTGATTCA TATCAATACG 3420 

ACAATCAAAA CCAAAATCAT CGTCGTCAAT CTGAAGACGC ATCGTATAGA CAACAATATG 3480
CTAAAGGCGA TCCTGAAGAA CACCCGGAAC GATACTATAA TGGTAGAGAT TATCGAAGAG 3540

AACAAATTCT TGAAGAAGAA AACGAGAAAT CCCGCCGTTC AAAAAAATGG TTATATATCA 3600 

TTATTGCCAT TCTCTTAATT ATTGTCGCTA TTTTTGTCAC ACGCGCCTTA CTTAACAATG 3660 

ATAGCGATAA AGTTAGTAAT GACCCTAAAG TCTCTCAAAA TTATAAAAAA CAAGTTGAAA 3720 

ATCAAGACGG CCAAATTAAC CAGCAAGTAG ATAATGCTAA AGAAAATATT AAAAACAACC 3780 

AAAAAACTGA TGACATTATT AAAAATTTAC AAAATCAAAT CGACAACTTG AAGCAGCAAG 3840 

AACAAAACAA AGCTGATTCT AAGCTAACTC AATTTTATCA AGACCAAATC AACAAATTGA 3900 

CAGAGGCAAA TAATGCACTT AAAAACAATG CAAGCCAAGG TAAAATTGAA AGCATGTTAA 3960

ATGATATTAA TACAAAATTC GACAGTATTA AATCTAAATT AGAAAGCTTA TTTAAAGATG 4020 

ACAATGGTGG CGCTAATTAA TTATTACACC TGCTTTGATG ATAAACATTA ATTCCCTATA 4080

CTTTATCTGT ATCACTACGT TATTCGTGAT GATGCATTAA GAGTATAGGG ATTTTTTATA 4140 

TAAACTTGTA TTCTAACTAC ATACAAATAC ACACAAAACG TATATAATTT ATATAATTAT 4200
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TTATTGCTAA TTACGTTAGG CGTCATGACC GCTTTTGGCC CACTAACTAT AGATATGTAC 4320 

GTACCATCAT TACCTAAAGT GCAAGGTGAT TTTGGTTCTA CTACATCAGA AATTCAATTA 4380 

ACATTATCAT TCACAATGAT TGGTCTTGCA CTAGGCCAAT TTATCTTTGG ACCTTTATCC 4440

GATGCTTTTG GTCGCAAACG GATTGCTGTA TCCATTTTGA TCATTTTCAT TTTGGTATCA 4500 

GGTTTGTCTA TGTTTGTTGA TCAATTGCCA TTATTCTTAA CTTTACGATT TATTCAAGGT 4560 

TTAACTGGTG GTGGCGTCAT CGTGATTGCA AAAGCCTCTG CTGGTGATAA ATTTAGTGGC 4620 

AACGCACTCG CTAAATTTTT AGCATCTTTA ATGGTAGTTA ATGGCATCAT CACTATTCTT 4680

GCACCATTAG CCGGTGGATT AGCTTTATCC GTAGCAACAT GGCGTTCTAT TTTCACAATT 4740 

TTAACTATTG TGGCACTCAT CATTTTAATT GGCGTCGCTT CTCAATTACC TAAAACATCT 4800

AAAGATGAAT TAAAGCAGGT GAATTTTAGT AGCGTCATTA AAGATTTTGG AAGTCTTTTG 4860 

AAAAAACCAG CATTTATTAT TCCAATGCTA TTACAAGGwT TAACTTATGT AATGCTATTT 4920

AGTTATTCAT CTGCATCGCC ATTTATTACT CAAAAATTGT ATAATATGAC ACCCCAACAA 4980 

TTTAGTATCA TGTTTGCTGT TAACGGTGTA GGTTTAATCA TTGTCAGTCA AGTCGTTGCT 5040 

TTATTAGTAG AAAAATTACA TCGCCACATA TTATTAATCA TTTTAACTAT TATACAAGTG 5100

GTAGGTGTTG CTTTAATTAT CCTGACACTT ACATTCCATT TACCACTTTG GGTCTTACTC 5160 

ATCGCATTCT TCTTAAATGT GTGTCCTGTG ACGTCAATTG GACCGCTTGG TTTCACAATG 5220 

GCTATGGAAG AACGAACAGG TGGCAGTGGT AACGCATCAA GTTTACTTGG CTTATTCCAA 5280 

TTTATCTTAG GTGGCGCTGT TGCACCATTA GTTGGCTTAA AAGGCGAATT TAATACATCA 5340 

CCATATATGA TTATTATCTT CATTACAGCC ATTCTATTAG TCAGTCTACA AATCATTTAC 5400 

TTTAAAATGA TTAAAAAGCA ACATGTCGCA TAACACTTCA ACATAATTAG AACCCTAGCA 5460

AAGATATCTA TCTTTGTCAG GGTTCTTCTT TATGAATTAT GAGATCGAAT CTTCAACTAA 5520 

AATTACGCCT TCATAGCAAG GACATTTCTA TTCAATCACC CTTTAACAGG CATCCAAATT 5580 

TcTGTAATAT ATTTTTCACT TGTAGTATCA CCAT 5614 
(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9179 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 
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50 



AAAGACAATG ATATGAAGTA TATGGATATC ACAGAaAAAG TGCCAATGTC GGAATCTGAA 120
GTTAACCAAT TGCTAAAAGG TAAGGGGATT TTAGAAAATC GAGGGAAAGT TTTTCTAGAA 180
GCTCAAGAAA AATATGAGGT TAATGTCATT TATCTTGTTA GCCATGCATT AGTAGAAACA 240
GGTAACGGCA AATCAGAATT AGCAAAAGGC ATTAAAGATG GGAAAAAACG CTATTACAAC 300
TTTTTTGGTA TAGGAGCATT CGATAGTAGT GCTGTTCGTA GTGGGAAAAG TTATGCTGAA 360
AAGGAACAAT GGACATCACC AGATAAGGCG ATTATTGGTG GTGCAAAGTT CATTCGTAAT 420
GAATATTTTG AAAACAATCA ACTGAATTTA TATCAAATGC GATGGAATCC AGAAAATCCT 480
GCGCAACATC AATATGCGAG TGACATTCGC TGGGCAGATA AAATTGCCAA ATTAATGGAT 540
AAATCCTATA AGCAGTTTGG TATAAAGAAA GATGATATTA GACAAACATA TTATAAATAA 600
GACATCGGTG CTTAAAGGAG CTGGAACAAT TTATTGTTTC GAGCTCCTTT AGCGCATTCT 660
GAGTGTGTTA GTTAAATGGA TTTTAACCTA ACAAAAAACG CTATATAGCA TCAAATATGC 720
TATATCCCAC ATCATTGTTA CAAATGTACA TGATGTAAAT GAATATTGCT GTCTAAATGT 780
GCATGTAATA TACAATGGTG CAGATAATAC ACTTAAGTCC TTAAAAATGA AACGTTAgTT 840
CCAAGAGTCA TTTTTAAACA ATAGTGCATG TGATAAAATA GAAAAGAATG AAAAATATAG 900
AGGTGACAAT ATGAAGATAG CAATTATAGG TGCAGGCATC GGTGGATTAA CAGCTGCTGC 960
ATTATTACAA GAACAAGGTC ATACTATTAA AGTCTTTGAA AAAAATGAGT CAGTTAAAGA 1020
AATTGGCGCT GGGATTGGTA TCGGAGATAA TGTGCTTAAA AAACTAGGTA ATCATGACTT 1080
AGCTAAAGGT ATTAAAAATG CTGGGCAAAT CTTATCTACA ATGACAGTGT TAGATGACAA 1140
AGATCGCCTG TTAACTACTG TTAAATTAAA AAGTAATACA TTGAATGTGA CGTTACCACG 1200
CCAAACATTA ATTGACATTA TTAAATCTTA TGTAAAAGAT GACGCAATAT TTACAAATCA 1260
TGAAGTCACG CATATAGATA ATGAGACAGA TAAAGTTACC ATACATTTCG CGGAACAAGA 1320
AAGTGAAGCA TTTGATTTAT GTATTGGTGC TGATGGAATT CATTCTAAAG TGAGACAATC 1380
TGTAAATGCT GACAGTAAAG TATTATATCA AGGGTATACA TGCTTTAGAG GTTTAATTGA 1440
TGATATTGAT TTAAAGCATC CGGaTTGTGC AAAAGAATAC TGGGGaAGAA AAGGaAGAGT 1500
AGGTATTGTT CCGTTATTAA ATAATCAAGC ATATTGGTTC ATTACAATTA ACTCGAAGGA 1560
AAACAATCAT AAATATAGTT CGTTTGGTAA ACCTCATTTG CAAGCATACT TTAATCACTA 1620
TCCAAATGAA GTTAGAGAGA TCTTAGACAA ACAAAGTGAA ACAGGTATCT TATTGCATAA 1680
TATTTATGAT TTGAAACCAC TCAAATCTTT TGTTTATGGT CGTACTATTT TACTAGGAGA 1740
TGCAGCACAT GCGACAACGC CTAATATGGG GCAAGGTGCT GGACAAGCAA TGGAAGATGC 1800
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TAAAATACGT GTCAAACATA CTGCAAAAGT AATTAAGCGT TCTAGAAAAA TCGGTAAAAT 1920 

TGCCCAATAT CGTAGTCGTT TATTTGTTGC AGTTAGAAAT CGTATTATGA AAATGATGCC 1980 

AAATGCATTA GCAGCTGGAC AAACTAAATT CTTATATAAA TCGAAAGAAA AATAATACAA 2040 

CAATATGAAA ACCCCCGTAT GTTGAAACGA GAGCTCAACA TATGGGGGTT CTTGTTTTTA 2100 

TAATGTTATT ATAATAAATT CAATTATTAG TTAACGACAA ATTGTGGTTT CTCACCTTGA 2160 

ACGGCACTAA TTGCAGCATT AGCAACAATT TTAGACATCA TGTCACGTGC TTCAAATGTA 2220 

GCATTACCAA TATGCGGTGT TAATACTACA TTATTAAGTG ATTTTAAGTC ATCGGTAATA 2280 

TCTGGTTCAA ATTCATATAC ATCAAGTGCA GCACCTTCAA TTTCATTATC TTTCAATGCT 2340

TGCACTAGTG CTTGTTCGTG CACGATTGGA CCACGAGAGG CATTGATTAA ATACGCCGTA 2400 

GATTTCATCA TTTTAAATTG TTCTGTATCA ATTAAATGAT GCATTTTAGG ATTATAAGCA 2460 

GCGTTGATAG TGATAAAATC TGCATTCTTT AATAGTGTAT CTAAATCTAC ATATTTTGCA 2520

CCGATTTCTC GTTCTTTTTC TTCTTTGCGA TTAGGTCCAG TGTATAGCAC ATCCATGTCA 2580 

AATGCTCTTG CACGACGAGC TACTGCACTA CCAATTTCAC CTAAACCGAT AATGCCGATT 2640 

GTTTTCCCAG ATACTTCTCT ACCTCTGAAA AATAAAGGTG CCCATCCATC AAATCCAGTT 2700

GTACGTGATA ATTGGTCCCC TTCAACAATA CGACGCGCTA CTGCAAGTAC TAATCCAATT 2760 

GTTAAATCAG CAGTCGCGTT TGTTGATGCT TTAGGTGTGT TTGTAACATC TATACTTTTT 2820 

TCTCGGGCAT ACTCGATATC AATATTATTA AAACCAGCGC CATAGTTGGC AATGATTTTT 2880 

AAGTCTTTAC CAGCATCGAT AACATCTTTA TCAACGTTTG TAGATAATAA ACTAATTAAG 2940 

GCAGTCGCGT TTTTAACACC TTTAATTAAA GTGTCTTTAT CGACTAATCC TTTACCTTCA 3000

TACATTTCAA CTTCAAAATG TTCTTGTAAA AGTTTTAAAC CTACTTCTGG TATtGCACCA 3060 

gCAAGATAAm CTTTTtCCAT AAAAGAtCAC TCCTTTTATC TTAGTATAGT AGAAGATTAG 3120 

ACAGTATACA ACTATGTCAT GATGTCTTGT GTATCAATGA TGTAAGCGCG TACTTTTGAT 3180

GGAGGCGATA TAACTTAGGC ACTGTAGAAC TATGAATATT GTAATGTGGA AAAACTGGAT 3240

CAATTAAATT AGATAACGTA GTTTTAAAGT TAATAGTATT AGAAAAAATT AATATTTTGA 3300 

ATATGGGAGG AAATATAAAT AAGTAGGTGG CAACGAAAAA TAGCAAAAAA AGAGCTTCTC 3360

CTATAAAGGA AAGCTCAAAG TTTTTTGATG ACATATGTAC TAGAATTAAG TTTCAAGACA 3420 

ATATGTATCA TCGTGTTTAT ATTAAATATG GATGTAGTTG TAGTTACCTG CTTCACTTGC 3480

AGAAATAGTT CTAGAACTTA CTGAGAAAGG TCCGCCACTA TAATTCATTT CTGAAATTGT 3540

AACTGAACCA TCACTGTTTA CACTTTCTAC ATATGCAACG TGACCAAATG GTCCTTCAGA 3600

AGCAGCAGCC CAATTATTAG CATTTCCCCA AGTAGAACCG ATTTCTCCGC CAACTTTATC 3720

ATATACATAC CAAGTACATT GTCCTGCAGT GTATAAGTTA CCAGAATGTG AAATTGATGA 3780

TGTAGTTGTC GTAGTTGTCG TAGTCGTTGT AGTTTGAGTC GTGTTGTAGT TATAGTTGTT 3840

GTAATTTGTA TAATTTTCAG CAGCATCTGC ATGATGTGCT TGACCTACTA ATGCTGTGCC 3900

GATTCCTGCT GTTAACGTAG TTGCTGTTAC TAATTTTTTC ATGAATAAAG TCCTCCAAAG 3960

TTCTATATCT TTTTTTATAA ATAAAACGTA GCGACTGTTT TATTCTCACA TCTCGAATTG 4020

ATGACAATAG TTACTTTAAC AAAATtAATG CTTCTTGTGG GGAATGTTAT TGATTTGTAA 4080

AAGAATAAAA AAACTTTGAC TAATTTTGTA ATAAAAATTA GTCAAAGTTA CAATGAGATT 4140

AACAGATAAT TAATAGGAAA TATTTATTTG TAATATGTTT AAATAAATCG AATTGTTAAA 4200

GGTATTATAT ATTCTTGGCC ATTATAATAT TTGACACACG CAATAATTGT GAATACAAAA 4260

GATAATATTG AGAAAGCGAA TATGGATAAA ATACCGATAA ACGTAATGAT GAAACCTATA 4320

ATAATAATGA AATCAATATC TGTAGCAATT AGGAAAACGC CTATTAAAGT GATAACGACT 4380

AAAACGATAG ACCAAATAAT ATAAGAAATC GTATAGTTAA GATAATTTTT TCCAGCACGA 4440

TCAACTAGTT TCGATTCATC TTTTTTCAAT AACCATATTA TCAGTGGACC AATAATAGAT 4500

GTGAATAAAC TTAATAAATA GATAAGCATC GCCATAATGT TCTCATCATT GGATTTGCGA 4560

TTCGGTTGAT GATTTGTTAC GTCGTTCATT TCAGTTGTCA TATTAGACAC TCCTTTGAAA 4620

ATTGTAATAT TATCTTTAAC TATAACAAAA TATAATCAAA AATAAACATG TTTATTAAAC 4680

AATTATTAAA AATAAAAATA ATTGGTGGAC GTCGGCGTTT AAATAGGTTA ATTTAAGGTT 4740

ATATATACTT AACATTTATA ATGATGCGTA ATGAATTCGC ATCATTTTTA TATTGTCTTA 4800

CGTATAATTT GTTTTTAATT TTAACCAAAG ATAGAAAGAG GGTTGTTTAT GAAAATAGCA 4860

ATTGTAGGAT CAGGAAATGG CGCAGTTACG GCAGCAGTAG ATATGGTGAG CAAAGGCCAC 4920

GATGTTAAAT TATATTGTCG TAATCAATCT ATAAGTAAGT TTCAAAACGC AATCGAAAAG 4980

GGCGGATTTG ATTTTAATAA TGAAGGTGAT GAACGTTTCG TAAAATTCAC TGATATTAGT 5040

GATGATATGG AATATGTTTT AAAAGATGCT GAAATTGTTC AAGTGATTAT TCCATCTTCA 5100

TACATAGAGT ATTATGCTGA TGTAATGGCA GAGCATGTAA CTGATAATCA GTTGATATTC 5160

TTCAACATGG CTGCAGCAAT GGGGTCAATT CGTTTTATGA ATGTTTTAGA AGATAGACAT 5220

ATTGAAACAA AACCACAACT AGCGGAAgcT AATACGTTGA CGTATGGTAC GCGTGTCGAT 5280

TTTGAAAATG CAGCAGTTGA TTTATCTCTA AATGTACGTC GTATCTTCTT TTCAACATAT 5340

GATAGAAGCT GTCTAAATGA TTGTTATGAC AAAGTTTCAA GTATTTATGA TCATTTAGTA 5400
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CCAACATTAT TGAATGTCGG TCGCATTGAT TATGCTGGCG AGTTCGCTTT ATATAAAGAA 5520 

GGAATTACTA AACATACAGT TAGATTACTT CATGCAATCG AATTAGAACG TTTGAATTTA 5580 

GGCCGTAGAT TAGGTTTTGA ATTATCAACA GCTAAAGAAT CACGTATTGA ACGTGGTTAT 5640

TTAGAACGTG ATAAAGAAGA TGAACCATTA AATCGTTTGT TTAATACAAG CCCAGTATTT 5700 

TCACAAATTC CAGGACCAAA TCATGTAGAA AGCAGATATT TAACTGAAGA TATTGCATAT 5760 

GGTTTAGTAC TATGGTCAAG CTTAGGTCGT GTTATTGATG TACCGACACC AAATATAGAT 5820 

GCAGTAATTG TAATTGCATC AACCATTTTA GAGAGAGACT TCTTTGAGGA AGGCTTAACA 5880

GTTGAAGAAA TTGGTTTAGA TAAGCTTGAT TTAGAAAAAT ATTTAAAATA AATGATGGCT 5940 

TGAAGATAGA AAAGGATATA GCATTATGCA AAAGCAATAA ATTGAAGAAA AGAGGTTTCT 6000 

CATCAATAAG CGnAGGGGAC GATAGATGAT GAAAAGAAAA CCCACCTTTT TAGAATCAAT 6060 

TTCGACAATG ATTGTAATGG TTATTGTTGT TGTAACAGGC TTTGTGTTTT TTGATATTCC 6120

AATTCAAGTA TTATTAATTA TTGCCTCAGC ATATGCCACA TGGATTGCAA AACGTGTAGG 6180 

CTTAACATGG CAAGATTTAG AAAAAGGCAT TGCAGAACGT TTAAATACTG CAATGCCTGC 6240 

AATTTTAATT ATACTAGCGG TAGGAATTAT AGTAGGCAGT TGGATGTTTT CTGGCACAGT 6300

GCCAGCCTTG ATTTATTATG GCTTAGATTT ATTGAATCCA AGCTATTTTT TAATATCAGC 6360 

CTTTTTTATA AGTGCTGTTA CATCTGTAGC AACTGGTACA GCATGGGGCT CTGCATCAAC 6420 

TGCAGGGATT GCACTTATTT CTATTGGTAA TCAATTGGGG ATTCCTCCAG GGATGGCAGC 6480

GGGTGCTATT ATAGCAGGGG CTGTGTTTGG CGATAAAATG TCACCATTAT CAGATACAAC 6540 

TAATTTAGCG GCGCTTGTTA CTAAAGTTAA TATATTTAAA CATATACATT CGATGATGTG 6600 

GACGACGATA CCTGCATCAA TCATAGGTTT ATTAGTATGG TTTATTGCTG GATTTCAATT 6660 

TAAAGGGCAT TCAAATGATA AACAGATTCA AACTTTGTTA TCAGAGCTTG CACAGATTTA 6720 

TCAAATTAAC ATATGGGTCT GGGTTCCCTT AATTGTGATC ATTGTTTGTT TGCTATTTAA 6780 

AATGGCTACA GTGCCAGCTA TGCTAATATC AAGCTTTTCT GCCATTATAG TGGGGACTTT 6840 

TAATCATCAT TTCAAAATGA CAGATGGTTT CAAAGCAACA TTTAGTGGTT TTAACGAATC 6900 

AATGATACAT CAGTCTCATA TTTCATCCAG TGTGAAAAGC TTGTTAGAAC AGGGTGGTAT 6960

GATGAGTATG ACCCAAATAT TAGTAACGAT ATTTTGCGGA TATGCATTTG CAGGTATTGT 7020 

AGAAAAAGCA GGATGTTTAG AAGTCTTATT AACTACTATT TCTAAAGGCA TCCATTCTGT 7080 

AGGAAGTTTA ATATGTATTA CTGTTATTTG TTGTATTGCG CTTGTATTCG CTGCAGGTGT 7140 

TGCTTCGATT GTAATTATTA TGGTCGGTGT GTTAATGAAA GATTTGTTCG AAAAATACCA 7200 
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AATACCATGG GGAACATCAG GTATTTACTA TACGAATCAA CTTCATGTCT 


CTGTTGAAGA 7320

ATTTTTCATA TGGACAGTAC CATGTTATTT ATGCGCAATT ATAGCAATTA TCTATGGTTT 7380

TACAGGGATA GGTATTAAAA AGTCATCGAA TTCACGTTTA ACTTAATGTG AGCGTGGAAT 7440

ATATATAATA TGTTGAAACA CTTTAATCAT TTATAATTGT AGCGGTTATA ATTTGAAAAG 7500

GTTTTAACTT AGAATAAATA TCCTCTATGC ATATACTGAA TATGTTTTGT AGCGGAACAT 7560

GTTGATATAT GTAATGTAAG TTTTATGTCA TGATTTGTAA TGACTAAATT AATTGAGAAT 7620

TTGAAGGCAA GTATATTTGT AAGTACTTTA ACTAAAAATT TATCAATGTA TAGCCGATTT 7680

GACATGCCTA AATTTGGGTG TGTCAATGGC TX3TATGTTGT TTATTCTTTA TTACAGAGTG 7740

AATCGGATTG GTGAAAATCG AAATTTTGAG ATTTTTACCA ATTCGATTTT TTTCATAGAA 7800

ATTAAAAAAG CCAACAAGGC TCTTGAAACC TTGTTGGCGT AAACATAGCC ATCACTAATT 7860

AGTGAATGAA GTTATAACCA GCAGCTTGGC TAGCTGAGAT TGTACGTGAA GTTACAACAC 7920

CTGGGCCATA ACCATAGTTC ATTTCTGAAA CTCTTACTGA ACCATTGCTG TTAACACTTT 7980

CAACGTATGC AACGTGACCG TATGCACCTT GAGTTGTTTG CATAATTGCA CCAGCTTTTG 8040

GTGTATTGTT CACTGTGTAA CCAGCTCTTG CAGCTGCGTT AGCCCAGTTA CTTGCATTGC 8100

CCCAAGTTGA ACCGATTTTA CCACCTACAC GATCAAATAC GTAGTATGTA CATTGACCAG 8160

AAGTGTATAA GTTACGTCCT GAAGTATAAC CACTTGAGAT TGAACGGCCA TTTGATGATG 8220

GAGCCATAGT TGTAGTTACT TGAACATTGT TGCTTGAAGT GCTGTAGCTT GCACCTAAAC 8280

CACCAGTACG GTAGCTGTTT GTGTTGTAAC TATTATAGTT ATTGTAGTTA TATGATTGAT 8340

TATTATTTGA GTAGTTGTTG TAACGGCTGT AGTTATTGTA GCTATAACCG TTGTTGTAAT 8400

TGTTATAGTT ATTGTAACCA TTGTAGTAGT AATAGCTGTA GTAGCCATTA TCTTGGTTTA 8460

ATTGACTTGG ATGCCAGTTA CCTTTCCATG TGTAATGGTA GTTACCTTGT GCATCAATAG 8520

TGTAAGTATA GCTATATGAT GTTGGGTCGT TTGGATTATA ACCGTAGTTA TCTTGCTCAG 8580

AAGCATGAGC TTGATTTCCT GATGCAATTG CGATTGTAGC GAATCCTGCA GTTGCGATAG 8640

TAGCTGTAGC GATTTTCTTC ATTTTAAAAA TATCCTCCTA AAAATTTTAA ATCTAAAATA 8700

TTTTCGTAAT GTCCGTGTGA CAAAATTAAT GTTATAAGTT ATCTCTCGTA ATTAAACGAC 8760

AAGAAAGACT ATAACAGAAA f TAGCGTCCT TGTGTGCTTT GTTAACGTTT TGTAATTTTT 8820

TGCTAATATC TTGACACAAT AGAATTTTAA AAGTATAGAA ATTTGCATTT TGCAAAACTT 8880

ATAACTACGG CATTCTTTGT GAAAACTGAA TGTTTCGAAA ATAAGTCTGT TACAAATTTG 8940

TAATATTACT GAAAATTCTA AATGTATATT TTGTGCATAA TATAGGACTT TTAATCAGAA 9000
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GGATGAAAAT GTATATTTAA TGGATAAAAT ATCCTAATTT AGCATAAAAA AATGTTTTAA 9120 
TAAAAGTATT ATTTGATATA ATCGATTTAT GTTTTGTTAC TGCTAAAAAA CATGTGGCG 9179 
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1868 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

CCTTCAGCCA TTTGACTTCG ACATGAGTTG CCTGTACATA TAAAATAAAT TGTTTTTTTA 60

GTCATAACAA TCTCCTAATT AATTAAAATA TGATAAGTGT TAGATACAAC CCTATGAGGG 120

TTATAAATAG TACTGGAATT GTAATGATGA TACCAGTTTT AAAGTATGTG CCCCAAGAAA 180

TCTTAACATC TTTTTGtGTT AAGACGTGTA ACCACAGTAA TGTAGCTAAA GAGCCTATCG 240

GTGTAATTTT TGGACCTAAA TCAGAACCGA TAACATTCGC ATAAATTAGG CCTTCTTTTA 300

ACATGCCATG GACATTTGAT TGACCAATAG CAATCGCATC TATTAAAACT GTAGGCATAT 360

1A1 1 LAI I Ai. TGATGATAAA AACGCTGAAA TGAAGCCCAT TCCCAAAATA GTGCTAAATA 420

wAAA I ATAl TCTAATATTT TAGCCAATAT TAAAGTAATG CCAGCATTTC 480

TTAAGCCGAA TACGACGATA TACATACCAA TTGAAAATAA TACTATATTC CAAGGTGCGC 540

CCTTAATGAC TTGCTTAATA TTTACAGCAT TTGATTTACG AGCCAACATT AGAAAAATAA 600

AAGCAATGAT TCCAGTGAAA ATTGATACCG GAATTTTAGT AAATTTACTG ATTAGATAGC 660

CGAAAAGTAA TATAACTAGA ACAATCCaTG AAATTTTAAA TAGCTTTAAA TCATTAATGG 720

CATCTTTAGG ATGCTTTATA TTATTATCAT CAAACGTTTT AGGTATCGCT TTTCTAAAAT 780

ATAACCACAA TACTATAATA CTTGCTAAAA GCGAGAATAA ATTAGGTATA ATCATTCTAC 840

TAAAATATCG AACGAATCCT ACATGAAAAT AATCAGCAGA TATAATATTC ACTAGATTGC 900

TCACGATTAA AGGTAAAGAA GTTGTGTCAG CTATAAAACC ACTCGCAATA ATnAAAGGGA 960

ATATGGCCCG CTTACTAAAA CCTATATTTT TAACCATCGC TAATACAATA GGCGTTAAGA 1020

TTAAcGTGCG CCATCATTTG CGAAAAATGC AGCAACAATG GCACCCAATA ATATGATATA 1080

AACGAACATT TTTAAACCAT TGCCTTTTGA AGCATGAAGC ATGTGAATAG CTGACCATTC 1140

GAATAATCCA ACTTTATCTA ATATTAATGA AATAAGAATG ACTGAGACAA AAGTCAAAGT 1200

AGCATTCCAA ACAATACCTG TTACTTCGAA AACATCGGAA AAACTTACAA CACCAGTAAT 1260
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TAATACAAAT AATAAAGTTA CTAGAAAAAT GAGTGTCGCT AAAGTTGTCA TCATTAGCAT 1380 

TCACCAGTCT TAAGGTTATG ACAAATACAT CGTTGGTTAG AGGTATGAAC CTTAGACAAG 1440 

TTATTAATTA CGGACTCAAA AATATTATGA TTgAGCTGGT ATAAATGTTT ATTTCCGATT 1500 

TTTCGTGTCG TAACTAAGTT GGTTTTTACT AATGCTTTCA TATGrTAGCT AAGTGTAGGT 1560 

TGAGAGAATT GAAAATGTGC TAACAAATCA CAAGCGCATA ACTCTCCACA AGAAAGTAAA 1620 

TCTAGTATTT CTAATCTGCT TGAATCTGAT AAAACTTTTA AAAATGTTGC TAGTTCTTTA 1680 

TACGTCATAA CATACCTCCT AGACGTTAAA TAGATTATCA TCTATATAGA TGAATGTCTA 1740 

TGTTCCTTTG GTATATTACA CGATATGACT ATGTAATTTA AATTTGGTTT TAGTATTAAA 1800 

AGGGTATTAA AGATAAATTA TAGATATTGA TTTTGCAAAA TATACTCTTT GTTCTGCATT 1860

GAAAAAGG 1868 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15249 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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ixi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

ATTTATGAAA TCCATAGCnA TAAACATTAT TCTTGCATCG GCTATACAAA CAGTTACCGC 60 

AAGCAAATTT GTATATCAAC CTGGAATTGT GTTCACGTCA ATGGCaAATG CCGATGATGT 120 

GTTATCAGGC GATAGTTATT TTATGGCTGA ATTAAAATCT ATTAAGCGTA TTGTTGAAAT 180 

TCCAGATAAT CAAAAAATAT ACTGCTTTAT AGATGAAATT TTTAAAGGTA CCAACACAAC 240 

TGAACGAATT GCCGCTTCAG AATCAGTACT ATCATTTTTA CATGAAAAAT CTAACTTTAG 300 

AGTTATTGCA GCAACACATG ATATTGAGTT AGCTGAACTC TTAAAACAAC GTTATGAAAA 360 

TTACCATTTC AATGAGGTAA TAGAAAATAA TAACATACAT TTTGATTACA AAATTAAGCC 420 

TGGCAAAGCA AATACACGTA ATGCCATCGA ATTATTAAAA ATCACTTCAT TTCCAGCAAA 480 

AATATATGAA CGAGCAAAAG ATAATGTCCC GAAAATTTAG CATTTAACTT TAAACATAAA 540 

AACGTCAGCT ATCACATGAC AGAAGACTAT GAACAGTTTC AATAATGTTC ATAGTAATCA 600 

TGTTAATAAC TGACGTTTAT TTTATTCTGC AGAATACTCT TCTAAATCTA TATTGCTGTG 660 

CCCATTTAAT GCTAAATCAG CAAATCGACC TTGCTGATAC AAATAGTGGC CGGCAACGCC 720 

TATCATTGCA GCATTATCTG TGCATAATTT AGGACTTGGG ATAGTTAATT GAATGTCATT 780 
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AACAATTAAT CGCTGAACAC CATATTCTTT ACAAGCTTGA ATAGCTTTAA ACGTGAGCAC 900 

CTCTACAACA CTGTTTTGAA AGCTCGTTGC TACGTTAGCT TCAATGATTG GaATATTTTT 960 

TTGTCGTTGA TTGTGAAGTT GATTGATTAC GGCACTTTTC AACCCACTAA AACTAAAATC 1020 

ATAACTATCT TTATCCAACC AAACACGAGG GAATGAATAA GTATCTTCAC CTTCAGCAGC 1080 

CAACCGATCA ACTTGTGGAC CACCTGGATA ATTTAAACCA ATTGTTCGTG CCACTTTATC 1140 

ATAAGCCTCA CCTACTGCGT CATCTCGTGT TTCACCAATG ACTTCAAATG ATAAATGATC 1200 

CTTCATATAA ACTAATTCAG TATGTCCACC TGAAACAATA AGTGCAATTA GCGGGAATGT 1260 

TAATGGCTCT TCTATGTGAT TAGCATATAT ATGTCCTGCA ATATGATGAA CAGGAATAAG 1320 

TGGCTnATCG TAAGCAAATG CCAATGCTTT GGCTGCATTA ACACCTATTA GTAACGCACC 1380 

AATTAGTCCA GGGCCTTCTG TAACCGCTAT GGCATCAATA TCTTCTATTG ATACATCGGC 1440

ATCCCCTAGA GCCTCGTTTA TTGTTGCTGT TATACCTTCA ACGTGATGTC TACTTGCCAC 1500

TTCGGGAACG ACACCGCCAA ATCGTTTATG ACTTTCAATC TGACTTAAAA CTGTATTTGA 1560 

TAAAATATCT CTGCCATTTT TTATAACACT AACGCTTGTT TCATCACAAC TTGTTTcAAC 1620 

AGCTAGTATT AATATATCTT TAGTCATTTA AATTCACCCA CATAACCATT GCGTCCTCAC 1680 

CTTCACCATA ATAATTTTTA CGTTTACCAC CATATTGAAA TCCTAAATTT TCATATACAT 1740 

GTTGTGCCAC TTTATTATTA ACTCTTACTT CTAAACTCAT CACATCACAA GTGTGACTTG 1800 

CATAGTTTAT TCCGTATTTT AAAAGCATTT GACCTAAACC ATAGCCTCTA TAATTATCAT 1860

CGATTGCAAC TGTTGTAATT TGAGCTTGAT CGATAACAAT CCATAAACCT AAATAACCAA 1920 

TAATTTGTTG TTCAAATTCt AAGACAAAAT ATTTCGCAAA GTTATTTTGC TCTATTTCAT 1980 

GATAAAATGC GTCAATTGTC CAAGAACTGT CATTGAAACT CCGACGCTCA AGATCAAAGA 2040 

CTTQTGGCAC ATCTTCTTTA GTCATCTCTC TAATGTTTAA TTGTTCTTTT GACTGTTGAT 2100 

CCAATTTCGT TCCGCCTCAG CTAATTTATG GTATTTAGGA GTAAATGTAT GTACGTCTGA 2160

AGGTTTATCT AGCAATTGAT ACATGACTGA TGCATTTGGT AGctGCGCAA TCACTTCACC 2220 

TTGTAATTCA TCTTGTAATT TTACAGTATC TTTCCCAATA TAAATAAATG GTTGGTTTAA 2280 

ATCTTCTAAA AAAGCTCGCA ATGCCTCTAT CGACATATAT TGATCTTCTA AAATAGTCAC 2340 

TAATTGACCA TTTTGCCACT GGAATATGCC TGTATAAACT GCTTGTCGTC TTGCATCAAA 2400 

CACAGGAACC AATAATTTAT CAGTATGATC GATTGTTGCT GCCAATGCCT TTAATGATGA 2460

AACACCATAT AATTTAACAT CTAACGCATA CGCTAATGTT TTAGCAACAG TAACACCGAT 2520

ACGTAAGCCA GTATATGAAC CAGGACCTTC AGCAACAATA ATCGCATCTA ATTGCTGTTT 2580 
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TTGTTTAGAA TCCGTAGTTA TTTCAGCTAA AACTTCATCG TTTTGCATCA ATGCTACTGA 2700 
TAATGGTTGA TTCGATGTAT CAATGAGCAG CGAATTCATG GATAATTGCC TCCTTAATTT 2760 
GTTCATAATG TTCTCCTTGC GCGAACAACT CAATTTGTCT TGTATTTTCA GATATTGTTG 2820 
AAATGTTAAT AGATAAATGC GTCGCTGGAA GTAAATCTTT TATAAATTGA CTCCATTCAA 2880 

TAACAGTAAT TGCCTGATCT TCGAAAAATT CATCAAATCC TAAATCTTCA TCAGAATCTT 2940 

CTAAGCGATA ACAATCCATA TGATGCAATT TTAAATTTTT ACCCCTATAT GATTTAATGA 3000 

TGTTAAATGT CGGGGAATTA ATCGTACGTC TTACACCAAG AGCTTTTCCT ATAAATTGCG 3060 

TTAACGTTGT TTTACCTGCT CCTAAATCTC CGTTAAGTAA AATCAAATCA CCACTTTTCA 3120

ATTGCTCAAC TAAAAATATA GCAAATTGAT TCATTTCATC TAAATTATTT ATCTTTATCA 3180 

ATGTTGATTC TCCTATATTA TGCTTTTCAT TCATAAAAAT GATTATCCAT TGTTCAATCG 3240

TATCTAACTT TATATTTAAC CTTTATATTG TAACAAATTT CAACTTAAAT TTCTTATCTT 3300

TGAAACAGAT TATCTATTCA AAGTTAATTG TAAGAAAATT TAAAATATTT GTTGACATAC 3360 

TAAAGCAGAT ATAGTAAATT AAATTTATCA AATTTTTAGA CAATTCTAAC TATTAAAGTG 3420 

ATATATACCA TTCACGGAAG GAGTATAATA AAATGCTTAA TCAATATACT GAACATCAAC 3480

CGACAACTTC AAATATTATT ATTTTATTAT ACTCTTTAGG ACTCGAACGT TAgTAAATAT 3540

TTACTAAACG CTTTAAGTCC TATTTCTGTT TGAATGGGAC TTGTAAACGT CCCAATAATA 3600

TTGGGACGTT TTTTTATGTT TTATCTTTCA ATTACTTATT TTTATTACTA TAAAACATGA 3660

TTAATCATTA AAATTTACGG GGGAATTTAC TATGCGAaCG AgcATGATCA AAAAAGGAGA 3720 

TCACCAAGCA CCAGCAAGAA GTCTTTTACA TGCCACGGGC GCGCTAAAAA GTCCAACTGA 3780 

TATGAACAAA CCATTTGTAG CTATTTGTAA CTCTTATATT GATATTGTTC CTGGACATGT 3840

TCACTTGAGA GAGCTTGCAG ATATAGCTAA AGAAGCAATT AGAGAAGCCG GTGCCATTCC 3900

ATTTGAATTC AATACAATTG GTGTTGATGA TGGAATAGCT ATGGGACATA TCGGAATGCG 3960

ATATTCTCTA CCATCACGTG AAATTATTGC AGATGCAGCT GAAACTGTAA TTAACGCTCA 4020 

TTGGTTTGAC GGCGTATTTT ACATTCCTAA TTGTGACAAG ATTACACCCG GTATGATTTT 4080 

AGCAGCCATG AGGACAAACG TACCAGCTAT CTTTTGCTCT GGTGGACCAA TGAAAGCTGG 4140

CTTATCTGCA CATGGAAAAG CATTAACACT TTCATCAATG TTTGAAGCAG TCGGCGCATT 4200

TAAAGAAGGA TCGATTTCTA AAGAAGAATT TTTAGATATG GAACAAAATG CCTGCCCTAC 4260 

TTGTGGTTCA TGTGCTGGGA TGTTTACTGC AAATTCAATG AACTGTTTGA TGGAAGTTTT 4320

AGGTCTAGCA TTACCATACA ACGGTACTGC ACTTGCAGTC AGTGATCAGC GACGAGAAAT 4380 
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TATCGTTACT CGCGAAgCAA TTGATGATGC ATTTGCACTT GATATGGCTA TGGGTGGTTC 4500 

AACAAACACG GTACTGCATA CGTTAGCCAT TGCCAATGAA GCTGGTATTG ATTATGACTT 4560 

AGAGCGCATT AATGCTATTG CCAAACGCAC GCCATATTTA TCAAAAATAG CACCTAGTTC 4620 

ATCGTATTCA ATGCATGATG TGCATGAAGC TGGTGGCGTC CCAGCAATTA TTAATGAATT 4680 

GATGAAGAAA GATGGCACGT TACACCCAGA TAGAATCACA GTTACTGGCA AAACGTTACG 4740 
TGAAAATAAC GAAGGCAAAG AAATTAAGAA CTTTGATGTC ATTCACCCTC TTGATGCACC 4800

ATATGATGCA CAAGGCGGTT TATCTATCTT ATTTGGTAAT ATCGCCCCTA AAGGCGCAGT 4860 

TATTAAAGTT GGCGGCGTTG ATCCATCTAT CAAAACATTT ACTGGGAAAG CAATTTGTTT 4920 

CAATTCGCAT GATGAAGCTG TTGAAGCAAT AGACAATCGT ACCGTTCGTG CAGGCCACGT 4980

CGTTGTCATT AGATATGAAG GACCTAAAGG TGGACGAGGT ATGCCTGAAA TGTTAGCACC 5040

TACTTCCTCT ATTGTTGGTC GCGGCTTAGG TAAAGATGTT GCATTAATTA CTGATGGGCG 5100

TTTTTCCGGT GCCACAAGAG GTATTGCAGT TGGTCATATT TCCCCTGAAG CTGCATCTGG 5160 

TGGACCAATT GCCTTAATTG AAGATGGTGA TGAGATTACT ATTGATTTAA CAAATCGTAC 5220 

ATTAAACGTA AACCAGCCTG AAGATGTTCT AGCGCGTCGC CGAGAATCTT TAACACCATT 5280 

TAAAGCGAAA GTAAAAACAG GTTATCTAGC TCGTTATACT GCCCTAGTAA CTAGCGCAAA 5340 

TACAGGTGGC GTCATGCAAG TCCCTGAGAA TTTAATTTAA TTTATTTTTA TATTGGAGAT 5400 

GGTTAAAATG TCTAAAACTC AACATGAAGT AAACCAAAAT ATTGACCCTT TAAAAATGGC 5460 

TGAATCACTT GAACCTGAAC AACTAAATGA AAAAACTTTA AATGATATGC GTTCAGGATC 5520 

AGAAGTGCTA GTAGAAGCTC TACTTAAAGA AAATGTGGAT TATTTATTCG GTTATCCTGG 5580 

TGGTGCCGTA CTACCTTTAT ATGACACGTT TTATGATGGT AAAATCAAAC ATATTTTAGC 5640 

AAGtfCACGAA CAAGGTGCTG TTCATGCTGC AGAAGGTTAT GCACGTGTAT CTGGTAAamT 5700 

GGCGTCGTTG TAGTTACAAG CGGTCCaGGT GCAACTAATG TAATGACAGG TATTACGGAT 5760

GCACATTGCG ACTCTTTACC TCTAGTTGTA TTCACTGGAC AAGTTGCTAC ACCAGGCATT 5820 

GGTAAAGATG CATTCCAAGA AGCGGATATT CTATCTATGA CTTCACCAAT TACAAAACAA 5880 

AATTATCAAG TGAAACGTGT TGAAGATATC CCTAAAATCG TACACGAAGC TTTCCATGTA 5940

GCTAATTCTG GACGCAAAGG TCCTGTAGTG ATTGATTTTC CAAAAGATAT GGGTGTTTTA 6000 

GCTACAAATG TGGATTTATG CGACGAAATC AATATTCCAG GTTATGAAGT TGTTACAGAA 6060 

CCAGAAAATA AAGACATTGA CACTTTCATC TCACTTTTAA AAGAAGCGAA AAAGCCTGTC 6120 

GTATTAGCCG GCGCAGGTAT TAATCAATCA AAATCAAATC AATTATTAAC ACAGTTTGTT 6180 
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GATACACTAT TTTTAGGTAT GGGAGGAATG CATGGTTCTT ATGCTAGTAA CATGGCATTA 6300 

ACTGAGTGTG ATTTACTCAT TAATTTAGGT AGCCGCTTCG ATGATAGATT AGCAAGCAAA 6360 

CCTGATGCCT TTGCACCTAA CGCCAAAATT GTACATGTAG ATATTGATCC TTCAGAAATC 6420 

AATAAAGTTA TTCATGTAGA TTTAGGTATT ATTGCAGACT GTAAAAGATT TTTAGAATGT 6480 
TTAAATGATA AAAATGTTGA GACTATAGAA CACAGTGACT GGGTTAAACA TTGTCAAAAT 6540

AATAAGCAGA AACACCCATT TAAACTTGGT GAAQAAGATC AAGTATTTTG TAAGCCACAA 6600 

CAAACAATCG AATATATCGG CAAAATTACA AATGGTGAAG CAATTGTTAC TACAGACGTG 6660 

GGACAACATC AAATGTGGGC AGCTCAATTT TATCCATTTA AAAATCACGG ACAATGGGTT 6720 

ACAAGCGGTG GTTTAGGAAC AATGGGATTC GGTATTCCTT CGTCAATTGG TGCCAAATTA 6780

GCTAATCCTG ATAAAACAGT CGTATGTTTC GTCGGTGACG GTGGTTTCCA AATGACAAAC 6840

CAAGAAATGG CACTTTTACC CGAATATGGT TTAGATGTCA AAATCGTACT AATCAATAAT 6900

GGAACATTAG GTATGGTTAA ACAATGGCAA GATAAGTTCT TTAATCAACG CTTCTCACAC 6960 

TCAGTATTTA ATGGTCAACC TGATTTTATG AAAATGGCAG AAGCATATGG CGTCAAAGGT 7020 

TTCTTAATCG ATAAGCCAGA ACAACTGGAA GAACAATTAG ATGCAGCGTT TGCTTATCAA 7080 

GGACCAGCTT TAATTGAGGT TCGTATTTCC CCTACTGAAG CTGTAACCCC AATGGTTCCG 7140

AGTGGCAAAT CAAATCATGA AATGGAGGGC TTATAATGAC AAGAATTCTT AAATTACAAG 7200 

TTGCGGATCA AGTCAGCACG CTAAATCGAA TTACAAGTGC TTTTGTTCGC CTACAATATA 7260 

ATATCGATAC ATTACATGTt ACACATTCTG AACAACCTGG GATTTCTAAC ATGGAAATTC 7320 

AAGTCGATAT TCAAGATGAT ACATCACTTC ATATATTAAT TAAAAAATTA AAACAACAAA 7380 

TTAATGTTTT AACGGTTGAA TGCTACGACC TTGTTGATAA CGAAGCTTAA TTTTAAGACA 7440 

AAGGCAATGA TGCGCTAATT AGTTATAGAT ATATCATAGG CTGCTAGTTA ACATCTGCCA 7500 

CTATTACAAA GTTATATTTC AGAATTTTCG AAACACAAAA TATTTAATTA TTTGGAGGAA 7560

TTTATTATGA CAACAGTTTA TTATGATCAA GATGTAAAAA CGGACGCTTT ACAAGGCAAA 7620 

AAAATTGCAG TAGTAGGTTA TGGATCACAA GGTCACGCGC ATGCACAAAA CTTAAAAGAC 7680 

AATGGATATG ATGTAGTCAT CGGCATTCGC CCAGGTCGTT CTTTTGACAA AGCTAAAGAA 7740

GATGGATTTG ATGTGTTCCC TGTTGCAGAA GCAGTTAAGC AAGCTGATGT AATTATGGTG 7800 

CTATTACCTG ATGAAATTCA AGGTGATGTA TACAAAAACG AAATTGAACC AAATTTAGAA 7860 

AAACATAATG CGCTTGCATT TGCTCATGGC TTTAACATTC ATTTTGGTGT TATTCAACCA 7920 

CCAGCTGATG TTGATGTATT TTTAGTAGCT CCTAAAGGAC CGGGTCATTT AGTTAGACGT 7980 
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CAAGCACGTA ATATTGCTTT AAGTTATGCA AAAGGTATTG GTGCAaCTCG TGCAGGTGTT 8100 

ATTGAAACAA CATTTAAAGA AGAAACTGAG ACAGATTTAT TTGGTGAACA AGCAGTACTT 8160 

TGCGGTGGTG TATCGAAATT AATTCAAAGT GGCTTTGAAA CATTAGTAGA AGCGGGTTAT 8220 

CAACCAGAAT TAGCTTATTT TGAAGTATTA CATGAAATGA AATTAATCGT TGATTTGATG 8280 

TATGAAGGCG GTATGGAAAA TGTACGTTAC TCAATTTCAA ATACTGCTGA ATTTGGTGAC 8340 

TATGTTTCAG GACCACGTGT TATCACACCA GATGTTAAAG AAAATATGAA AGCTGTATTA 8400

ACTGATATCC AAAATGGTAA CTTCAGTAAT CGCTTTATCG AAGACAATAA AAATGGATTC 8460 

AAAGAATTTT ATAAATTACG CGAAGAACAA CATGGTCATC AAATTGAAAA AGTTGGTCGT 8520 

GAATTACGCG AAATGATGCC TTTTATTAAA TCTAAAAGCA TTGAAAAATA AGATAGACCT 8580 

ACAATGAGGA GTTGTTAAAT ATGAGTAGTC ATATTCAAAT TTTTGATACG ACACTAAGAG 8640 

ACGGTGaACA AACACCAGGA GTGAATTTTA CTTTTGATGA ACGCTTGCGT ATTGCATTGC 8700

AATTAGAAAA ATGGGGTGTA GATGTTATTG AAGCTGGATT TCCTGCTTCA AGTACAGGTA 8760 

GCTTTAAATC TGTTCAAGCA ATTGCACAAA CATTAACAAC AACGGCTGTA TGTGGTTTAG 8820 

CTAGATGTAA AAAATCTGAC ATCGATGCTG TATATGAAGC AACAAAAGAT GCAGCGAAgC 8880

CGGTcGTGCA TGTTTTTATA GCAACATCAC CTATTCATCT TGAACATAAA CTTAAAATGT 8940 

CTCAAGAAGA CGTTTTAGCA TCTATTAAAG AACATGTCAC ATACGCGAAA CAATTATTTG 9000 

ACGTTGTTCA ATTTTCACCT GAAGATGCAA CGCGTACTGA ATTACCATTC TTAGTGAAAT 9060 

GTGTACAAAC TGCCGTTGAC GCTGGAGCTA CAGTTATTAA TATTCCTGAT ACAGTCGGCT 9120 

ACAGTTACCA TGATGAATAT GCACATATTT TCAAAACCTT AACAGAATCT GTAACATCTT 9180 

CAAATGAAAT TATTTATAGT GCTCATTGCC ATGACGATTT AGGAATGGCT GTTTCAAATA 9240 

GTTTAGCTGC AATTGAAGGC GGTGCGAGAC GAATTGAAGG CACTGTAAAT GGTATTGGTG 9300 

AACGAGCAGG TAATGCAGCA CTTGAAGAAG TCGCGCTTGC ACTATACGTT CGAAATGATC 9360 

ATTATGGTGC TCAAACTGCT CTTAATCTCG AAGAAACTAA AAAAACATCG GATTTAATTT 9420 

CAAGATATGC AGGTATTCGA GTGCCTAGAA ATAAAGCAAT TGTTGGCCAA AATGCATTTA 9480 

GTCATGAATC AGGTATTCAC CAAGATGGCG TATTAAAACA TCGTGAAACA TATGAAATTA 9540

TGACACCTCA ACTTGTTGGT GTAAGCACGA CTGAACTTCC ATTAGGAAAA TTATCTGGTA 9600 

AACACGCCTT CTCAGAGAAG TTAAAAGCAT TAGGTTATGA CATTGATAAA GAAGCGCAAA 9660 

TAGATTTATT TAAACAATTC AAGGCCATTG CGGACAAAAA GAAATCTGTT TCAGATAGAG 9720 

ATATTCATGC GATTATTCAA GGTTCTGAGC ATGAGCATCA AGCACTTTAT AAATTGGAAA 9780 

AAGAGGGTCA TATTTACCAG GATTCAAGTA TTGGTACTGG TTCAATCGTA GCAATTTACA 9900

ATGCAGTTGA TCGTATTTTC CAGAAAGAAA CAGAATTAAT TGATTATCGT ATTAATTCTG 9960

TCACTGAAGG TACTGATGCC CAAGCAGAAG TACATGTAAA TTTATTGATT GAAGGTAAGA 10020

CTGTCAATGG CTTTGGTATT GATCATGATA TTTTACAAGC CTCTTGTAAA GCATACGTAG 10080

AAGCACATGC TAAATTTGCA GCTGAAAATG TTGAGAAGGT AGGTAATTAA TTATGACTTA 10140

TAACATTGTT GCCCTACCTG GTGATGGAAT CGGTCCAGAA ATTTTGAACG GATCTCTATC 10200

ATTGCTTGAA ATTATAAGTA ATAAATATAA CTTTAATTAT CAAATAGAGC ACCACGAATT 10260

TGGTGGTGCC TCTATTGATA CATTCGGCGA GCCTTTAACT GAGAAAACCT TAAATGCGTG 10320

TAAAAGAGCA GATGCTATTT TACTGGGTGC AATCGGTGGA CCTAAATGGA CAGATCCTAA 10380

CAATCGACCA GAACAAGGAT TATTAAAATT GCGTAAATCC TTAAATTTAT TTGTAAATAT 10440

ACGCCCCACT ACCGTTGTCA AAGGCGCTAG TTCTTTATCA CCTTTAAAGG AAGAACGCGT 10500

TGAAGGCACA GATTTAGTTA TAGTCCGTGA ATTGACAAGT GGTATTTATT TTGGAGAACC 10560

TAGACATTTT AATAATCACG AGGCCTTAGA TTCTCTTACT TATACAAGAG AAGAAATAGA 10620

ACGCATTGTT CACGTAGCAT TTAAATTGGC CGCTTCAAGA CGAGGAAAAC TAACATCAGT 10680

TGATAAAGAA AATGTATTAG CTTCTAGTAA ATTGTGGCGC AAAGTCGTAA ATGAAGTAAG 10740

TCAATTATAT CCAGAAGTAA CAGTAAATCA CTTATTTGTT GATGCTTGTA GTATGCATTT 10800

AATCACAAAT CCAAAACAAT TTGACGTCAT CGTATGTGAA AACTTATTTG GCGATATTTT 10860

AAGTGATGAA GCTTCAGTGA TTCCTGGTTC ACTTGGTTTA TCACCTTCTG CTAGTTTTAG 10920

TAACGATGGT CCAAGATTGT ATGAGCCTAT TCATGGATCA GCACCAGATA TTGCAGGTAA 10980

AAACGTTGCC AATCCATTTG GAATGATTCT ATCTTTAGCG ATGTGTTTAC GTGAAAGCTT 11040

AAATCAACCA GATGCTGCAG ATGAATTAGA ACAACATATT TATAGCATGA TTGAACATGG 11100

GCAAACGACA GCAGATTTAG GCGGCAAATT GAATACTACT GATATTTTCG AAATTCTATC 11160

TCAAAAATTG AATCACTAAG GGGGAGATGT AAATGGGTCA AACATTATTT GACAAGGTGT 11220

GGAACAGACA TGTGTTATAC GGGAAATTGG GCGAACCGCA ACTATTATAC ATTGATTTAC 11280

ACCTTATACA TGAAGTTACT TCTCCTCAAG CATTTGAAGG ACTTAGGCTT CAAAACAGAA 11340

AATTAAGACG CCCAGATTTA ACATTTGCAA CACTCGATCA CAATGTTCCT ACTATTGATA 11400

TATTCAATAT TAAAGATGAA ATTGCAAACA AACAAATCAC AACATTACAA AAAAACGCCA 11460

TAGATTTTGG GGTGCATATT TTTGATATGG GTTCTGATGA ACAAGGTATT GTTCACATGG 11520

TAGGACCTGA GACAGGACTT ACACAGCCTG GCAAGACAAT CGTTTGTGGT GACTCTCACA 11580
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ATGTTTTCGC 


AACTCAAACG CTATGGCAAA CAAAACCCAA AAACTTAAAA ATCGATATTA 11700

ATGGTACCTT ACCAACAGGC GTCTATGCTA AGGACATTAT TCTGCATTTA ATTAAAACGT 11760

ATGGTGTTGA CTTTGGTACA GGCTATGCTT TGGAATTTAC TGGCGAAACA ATTAAAAACC 11820

TTTCAATGGA TGGTCGAATG ACTATTTGTA ACATGGCTAT CGAAGGTGGT GCCAAATACG 11880

GCATAATCCA ACCTGATGAT ATAACATTTG AATATGTTAA AGGGAGACCA TTTGCCGATA 11940

ACTtCGCTAA ATCAGTTGAT AAGTGGCGTG AgCTATATTC TGATGACGAC GCGATATTTG 12000

ATCGTGTAAT TGAACTTGAT GTTTCAACAT TAGAACCACA AGTGACATGG GGAACTAATC 12060

CTGAAATGGG TGTTAATTTC AGTGAACCAT TCCCTGAAAT CAATGATATC AACGATCAAC 12120

GTGCGTATGA TTATATGGGG TTAGAACCAG GTCAAAAAGC TGAAGACATC GACTTAGGGT 12180

ATGTTTTTCT CGGTTCATGT ACAAATGCTA GACTATCAGA J i~i~rGA'I u I'GZVA GCTAGTCATA 12240

TTGTTAAAGG AAATAAAGTT CATCCAAATA TTACAGCTAT TGTCGTACCA UVl X X V* X V*VJ X TX 12300

CAGTAAAAAA AGAAGCAGAA AAATTAGGTC TAGATACTAT CTTTAAAAAT GCAGGATTTG 12360

AATGGCGTGA ACCAGGATGT TCAATGTGTT TAGGCATGAA TCCTGACCAA GTACCTGAGG 12420

GCGTACATTG TGCATCTACA AGTAATCGAA ACTTTGAAGG ACGACAAGGC AAAGGTGCAA 12480

GAACACATTT AGTATCCCcT GCTATGGCAG CAGCAGCAGC TATTCATGGT AAATTTGTGG 12540

ACGTAAGAAA GGTGGTTGTT TAAATGGCAG CAATCAAACC TATTACAACA TATAAAGGTA 12600

AAATAGTCCC TCTCTTCAAC GACAATATCG ATACAGACCA AATCATTCCT AAGGTACACT 12660

TAAAGCGTAT TTCAAAAAGT GGCTTTGGTC CATTTGCTTT TGATGAATGG CGGTACTTAC 12720

CTGATGGTTC AGATAATCCT GATTTCAATC CTAACAAACC ACAATATAAA GGGGCTTCTA 12780

TTTTAATTAC TGGAGATAAT TTTGGATGTG GTTCAAGTCG TGAACATGCT GCTTGGGCTC 12840

TTAAGGACTA TGGTTTTCAT ATTATTATTG CAGGAAGTTT CAGTGACATA TTTTATATGA 12900

ATTGCACTAA AAATGCGATG TTGCCTATCG TTTTAGAAAA AAGTGCCCGT GAACATCTTG 12960

CACAATATGT TGAAATTGAG GTCGATTTAC CAAATCAAAC TGTGTCATCA CCAGACAAGC 13020

GTTTCCATTT TGAAATTGAT GAAACTTGGA AGAATAAACT TGTAAATGGC TTAGATGACA 13080

TTGCAATCAC CCTACAATAT GAATCATTAA TAGAAAAATA TGAAAAATCa CTTTAAGGGA 13140

GTTGAATATT ATGACAGTCA AAACAACAGT TTCTACGAAA GATATCGATG AGGCATTTTT 13200

AAGACTTAAA GATATTGTCA AAGAAACACC TTTACAATTA GACCATTACT TATCTCAAAA 13260

GTATGATTGT AAAGTCTATT TAAAACGAGA AGATTTACAA TGGGTACGTT CTTTTAAATT 13320

AAGAGGTGCT TACAACGCTA TTTCTGTTTT ATCAGATGAA GCTAAAAGTA AAGGTATTAC 13380
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AAACGCTGTT 


ATCTTTATGC 


CAGTCACTAC 


ACCTTTACAA 


AAGGTAAATC 


AAGTAAAGTT 






CTTTGGAAAT AGTAACGTTG AAGTTGTACT 


CACTGGTGAT 


ACATTTGATC 


ACTGTTTAGC 




5 


TGAAGCTTTA ACTTATACAA GTGAACATCA AATGAACTTT ATAGATCCAT 


TCAATAATGT 


1 **£"5rt 

1 Jv&U 




TCATACAATT 


TCTGGACAAG 


GTACGCTTGC 


TAAAGAAATG 


CTAGAACAAG 


CAAAGTCTGA 


i icon 


10 


CAATGTTAAC 


TTTGATTATC 


TATTTGCCGC 


AATTGGTGGT 


GGCGGTTTAA 


111 V«rf\w\J X f\ J 


i "7. 'J a n 


TAGTACTTAC 


TTTAAAACCT 


ATTCACCTAC 


CACGAAAATT 


ATAGGTGTTG 


AAPPTTPAnn 
Anv« CI 1 


t -a q nn 

1JOUU 




TGCAAGTAGT 


ATGTATGAAT 


CTGTTGTGGT 


AAATAATCAG 


GTAGTCACAT 




lJobU 


15 


CGATAAATTT 


GTGGACGGTG 


CATCTGTAGC 


TAGAGTTGGC 


GATATTACAT 


T*Pf2 Zi Zi & TTYIP 






AAAAGAAAAT 


GTAGATGATT 


ACGTTCAAGT 


AGATGAAGGT 


GCAGTTTGTT 


CTACGATTTT 


13980 




AGATATGTAT 


TCAAAACAAG 


CAATTGTAGC 


AGAACCTGCT 


GGCGCATTAA 


t "1 V ** 1' A 7V t ^ 

ulvj 1 AAo X VjL 


14040 


20 


GCTTGAAAAC 


TATAAAGATC 


ATATTAAAGG 


TAAAACAGTG 


GTTTGTGTCA 


1 1 ri\a X X 


14100 




TAATAATGAT 


ATTAATCGAA 


TGAAAGAAAT 


TGAAGAACGT 


TCATTACTAT 


ztrv a a/^a a 7\«"p 


14160 




GAAGCATTAC 


TTTATCTTAA 


ATTTCCCTCA 


ACGTCCAGGT 


GCATTGAGAG 


#v\X X lulnAA 


14220 


25 


TGACGTATTA 


GGACCTCAAG 


ACGATATTAC 


TAAATTTGAA 


TACTTAAAAA 




14280 




AAATACAGGT 


ACTGTCATTA 


TTGGTATTCA 


ACTTAAAGAT 


CATGATGATT 


X X ALAAL X 


14340 




CAAACAACGT 


GTAAAtCATT 


TCGATCCTTC 


CAATATTTAT 


ATTAATGAAA 


X X MMajiA X \s X X 


144 00 


30 


ATATTCATTG 


TTAATTTAAC 


ACATAGTAAG 


AAAAACAGTC 


ATAAATTGAT 


11^1 AA X XTJA 


14460 




AATCATCTTA 


TGACTGCTTT 


TTATTATACT 


TTACATTTCT 


CGTTTCGTCA 


wi 1 1 LAAALu 


14 520 


35 


TTTTCACTTC 


GCCAAGCCAT 


CTTTCTTTGT 


GTTTGCTTTT 


aTTTTGACGT 




X.4 SOU 


AAAAAaGAGA 


CCTTGCGGTC 


TCAATGCGGC 


TCATCGCATC 


CACTTTTTGC 




1 A CA O 




TCTACTCTAG 


CGGAACGTAA 


GTTCGaCTAC 


CATCGACGCT 


AAGGAGCTTA 




i a *7 n n 


40 


TCGGCATGGG 


AACAGGTGTG 


ACCTCCTTGC 


TATAGTCAC C 


AGACATATGA 


ATOTAATTTA 

.nxvJX-nnX X X r\ 


it / vU 




TACATTCAAA 


ACTAGATAGT 


AAGTAAAAGT 


GATTTTGCTT 


CGCAAAACAT 


TTATTTTfS AT 






TAAGTCTTCG 


ATCGATTAGT 


ATTCGTCAGC 


TCCACATGTC 


ACCATGCTTC 


CACCTCGAAP 


14 880 


45 


CTATTAACCT 


CATCATCTTT 


GAGGGATCTT 


ATAACCGAAG 


TTGGGAAATC 


TCATCTTGAG 


14940 




GGGGGCTTCA 


TGCTTAGATG 


CTTTCAGCAC 


TTATCCCGTC 


CACACATAGC 


TACCCAGCTA 


15000 




TGCCGTTGGC 


ACGACAACTG 


GTACACCAGA 


GGTATGTCCA 


TCCCGGTCCT 


CTCGTACTAA 


15060 


SO 


GGACAGCTCC 


TCTCAAATTT 


CCTACGCCCA 


CGACGGATAG 


GGACCGAACT 


GTCTCACGAC 


15120 




GTTCTGAACC 


CAGCTCGCGT 


ACCGCTTTaA 


TGGGCGAACA 


GCCCAACCCT 


TGGGACCGAC 


15180 
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GTGGAACTT 15249 
(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14051 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 



GTGGCAATAT TTCTAGTTCT CGTTTTGATA AGATTTTAAA AGGATCTGTT GTGTTTGCAG 60

TGTCCTGATT TGAATTAGAT ACAAATTCAT TCACTAAAGA TGTTGTAAGT TTCATATCTA 120

CATATGTTTC ACCTTTATAT ACAGTTCGAA TAGCTAACAA TAATTGTTCA TCAGGTGCAT 180

TTTTCAATAT GTAACCTTTC GCACCATTAC GCAACACATG GAACAAATAC TCCTCATCAT 240

CAAACATTGT TAATATTAGT ATTTTAGTTT CAGGAAAACT GTCAGCAATT TTACTCGTAG 300

CGATAAGACC TGACTCACCT GGTGGcATAC TTAAATCCAT TAGTAACACA TCAGGTTTAt 360

ATTCCATTAC TTTTTGGTAA GCTTCGACGC CATCTGCAGC CGTTGCAACA ACTTCCATAT 420

CATTTTGATA ATTTAAAATC ATAGAGAACC CCGTACGGAC AACAGCGTGA TCATCGGCAA 480

TGACTATTTT CAATTTTATT CCCCCAATGT ATGTTTCAAA TTGGAATGTT CAATGTAACA 540

TTGGTACCCT CACCAATTTT CGTTTCAATA TTGACGCTAC CGCTGACTAA CTCAGCTCGC 600

TCATTCATTC CATATAAACC GAGTCCAGAA CCTTTAGGCT TAGAACTTGG ATCAAAACCA 660

TTTCCCGCAT CTATCACTTC TGCTACCAAA TGGCGCCCAG TTTGACGGAT ACCTACATTT 720

ATTTCATTTA CATCAGCGTA TTTCAACGCA TTTAAAATAG CTTCTTGCAC TACTCGATAA 780

ACAACCGTTT CAATATCACT ATCAAAGCGA GTATTTTTAA TATTTGATGT ATATATGATT 840

TTTATTCCAT AATTTTCTTC AAACTGTTTA AAATATGATT TAAAAGCTGC TTCAAGGCCT 900

AGATCATCCA AAGAAGCGGG TCTTAATTCA ACCGACATAT TACGTATATC ATCAATTAAT 960

TTAGCGACAA TATATTCAAT ATTTTCTGCG TCTTCCAAAA GCTTAGTTGT ATCTTCTTGA 1020

TATTTTAATA ATCTCAATTG AACATCTACA TTGAGCATTT CTTGAATCAC ACTATCATGT 1080

AACTCTCTAG AAATTCGCTT TCTTTCATTT TCTTGGGCTG AGATTGTTTT ACGCATCATA 1140

CGTTGTTGAT GCAATTTCTC TTGCTGTTCA ATTTGTGATG AAACATTTTG AAGCGTAAAT 1200

GCATGAATTC CCCTGTCTTG ATCAATCAAC TGATATGTTG CTGTAAATGG CATCACTTTT 1260

TGATCTTTCG TCTTCATAAA TACTTGGAAA TTCGTAGCTT GTACTTGCAT CGATTCTAAG 1320
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ATCGCATTCG CCACAGCACT GTAATTATCT TCTTCAGATA ATATATCTTT AGCAGCATCA 144 0 

TTCATTGCAA TAATTTTACC GTTATCATCA GCAAAAACTA TCTTTTCGAT TGAATGCTCA 1500 

TAATATTTTT TCAATAAAGT ATCTAACTGT ATACTGTCCT CATTAATCAT GACTTACACC 1560 

CTAATTCATC TCATTATTTA TCATCATTGA AAATACCAAA CTTACGTTGA ATATCATCAT 1620 

TATCAAATAT TTTTGGTAAA GGACGACCAT CTCTTTGACC AAATAATAGT ACGCCATACA 1680 

CTTGATTCTT ATACCAAAGC GGCACTGCTA AAACTGCTGT TAATGATTCG CTCAATAAAA 1740 

TTGGATAGTC AATCTTTTCT TCAGGCCCTA AAGCTAAACC AACATTGGCT ATTACCATAC 1800

GCTTTCCTGT TTTCATAACA GTTCCAGCTA ATCCACGACC TTTTCTTAAA ATAATCAATT 1860

TAAATCGATT ATTTTTATTA CCTGAAACAT AGTGCCATTT TATTGGAGAT GATGGTTTGT 1920 

TAGATTCATA GAAAGCGATT GCCGCAAAAT CATAACCCTC TTCTTTGCGT ATTTTATCTA 1980 

ATGTCTCTTG AAATCTACGA TCTTCAATTA TTGCTTCTGG TGTCAAATCC TTTCACCTCT 2040 

TATGCTTACA CTTTATTCTT ACGGTAAATA ATATATCTGC GATTTATATA TGTCAAAGGT 2100 

ACACTCCAAA CATGCACCAA ACGTGTAAAT GGCCAACAAG CCATAATAGT GAAACCTAAC 2160 

AATATATGCA TTTTAAATGC AATCGGCACA CCACTCATCA ATGACGCATC TGGTTTTAAC 2220

ATAAATAATT GTCTAAACCA AATTGATAAT GAAGTTCTGT AGTTAAAGTC TGGATGTTGT 2280

ATATTTGTTA CTAATGTTGC GTAACATCCC ATAAATACGA TAAGTAATAA TAAGAAATTT 2340 

ACAAATATAT CCGACGCTGA ACTTAATCTT CGAATACTTT TCGTAGTAAC ACGTCTCGCT 2400 

GTTAATAAAA ACATCCCTAT CAAAGTTATT ATACCAAAGA TGCTACCAAT ATAAACAGCG 2460 

CCTATATGAT ATAAATGCTC AGACACACCC ACTGCATCCA TCCATGGTTT CGGTATTAAC 2520 

AATCCAACTA CGTGTCCAAA AAACACTGGA ATAATACCTA AGTGAAATAA TAAACTTCCC 2580 

CACATCAACC TTTTTCTTTC TATTAATTCA CTAGATTTAG CTGTCCAAGA AAATTTATCA 2640 

TAACGATAAC GTGCAATATG ACCTGCGACA AAGACAACTA AACATAAATA CGGAAATATA 2700 

ACCCATAAAA ACTGATTAAG CATGATGTTT CACTCCTTTT GGTGATGTCA AACATAATTT 2760 

CAATGTTTTT CTAAGTGCTT GAATCACATA GGCATATGGA TTGTTATCTT CACCAAGTGC 2820 

ATTCGCCATC ACATATGTTC CATCCTCAAT AATCATAATG ATTAATTGAA TATTCTCTTC 2880

AGCTCTTGGA TCATTTCGCC ATTCTGCCAC TTGCAAAAAT TGAAGCATCA ACGGTAGATA 2940 

ATCAGAAAGT TCATTATCTA CCATTTCTAG TCCAAACATT TCATATAATA CCTTTAATTT 3000 

AGCTAACATT TGCCCACGTT CTTTTTGCGT ATCAAATTTG TTATACGTCA TATATAATGG 3060

TGCTTTTTTC GTAAAATCAA ATGTATCTGT ATAAATCGCT TTGATTTCTG ATAATGAAAA 3120 
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TGTTTCTTCA AAAGTTTTTG GATGAAAAGT TAATTTTTCT GGAAAACATA ACTGTTGTGC 3240

CATATATCCA AAACTTTCTT GATATTTTTT AAAATTATCG AAATTAATCA CGGAAAATCC 3300

CTCCATAGAA ATTCTCATTA TAAATTTCTT GACCAGTTTT CCCTGAACCT ACTGCAACGC 3360

CACAGCCTTC ACAGTTATCT CCAAAATGCT CGCCGCCGTA ATTGTATCCT GTACTACCTT 3420

GTGCGTGATA CGTATCTAAA TAGGTTTCTT TGTGTGATGT TGGAATAACA AATCGATCTT 3480

CATATTTGGC TAGTCCTAAT AAACGATACA TGTCTTTAGT TTGGCGCTCG GTTATACCTA 3540

ATCGCTCTAA TCGAGACGTG TCAAATGGCT GTTGAGTAAC TTGAGATCTC ATATAACTTC 3600

TCATCATTGC CATACGTTGT AGGGCTCCTT TTACTGGCTC TGTATCTCCT GCAGTGAAAA 3660

TATTAGCTAA GTATTCAATA GGTAAACGCA TTTCTTCAAT GGCTGGGAAA ATCGCATCTG 3720

GATTTTGAGT TGTATTTTTA CCTTCAAAAT AGCTCATAAT TGGGCTAAGT GGTGGGCAAT 3780

ACCAAACCAT CGGCATCGTT CTAAATTCAG GATGTAACGG AAATGCAAGT TTATATTCAA 3840

TTGCTAACTT ATAAATTGGA GAGTTTTGTG CAGCTTCAAT CCAATCGTAA CCAATACCAT 3900

CTTTTTCAGC TTGAGCAATG ACTTCTTCGT CAAATGGGTT TAAGAATATA TCTAATTGTT 3960

TTTCATATAA ATCTTTCTCG TCTACTGCTG AAGCTGCTTC ATGAACTCGA TCTGCATCAT 4020

ATAATAAAAC ACCTAAGTAA CGCATACGTC CTGTACAAGT TTCAGAGCAT ACCGTAGGCA 4080

TACCCGCCTC GATTCTCGGG AAACAGAAAG TACACTTTTC AGCTTTGTTC GTTTTCCAAT 4140

TGAAGTAAAC TTTCTTATAT GGACAACCTG TCATACAGTA ACGCCATCCA CGACATGCGT 4200

CTTGGTCAAC TAATACAATG CCATCTTCAT CACGTTTATA CATAGCACCT GAAGGACACG 4260

ATGCAACGCA ACTTGGATTC AAGCAATGTT CACATAAACG TGGTAAATAC ATCATAAAAG 4320

TTTCGTCAAA TTGGAATTTA ATATCTTCTT CTATTTTTTG GATGTTAGGA TCTTTTGGAC 4380

CTGfAACATG ACCACCTGCT AAGTCATCTT CCCAGTTAGG TCCCCATTCA ATTTCAATGT 4440

TATCCCCCGT AATTTCTGAA TACGCTCTAG CAACTGGCGA ATGCTTCCCT GATTTCGCAG 4500

TTGTTAAATG TTCATAATTA TAGTTCCATG GCTCATAATA ATCTTTAATT AATGGCATAT 4560

CTGGGTTATA AAAAATTTTA CCTAAAGCAA TTTTTGAAAT TCTACTTCCA GATTTTAATT 4620

CAAGTTTCCC TTTACGATTT AGTACCCAAC CACCTTTGTA GTGTTCTTGG TCTTCCCAAC 4680

GTTTCGGATA CCCTACACCT GGCtTCGTTT CTACGTTGTT GAACCACATG TACTCAGCAC 4740

CTGGACGATT TGTCCaAGTG TTTTTACATG TCACACTACA CGTATGGCAT CCTATGCATT 4800

TATCTAAATT TAATACCATC GCAAcTTGCG GTTTAATCTT CAAGCCAATT AACCTCCTTC 4860

ATCTTTCTAA CTGCTACATA TAAATCCCTT TGGTTCCCAA TTGGTCCATA ATAATTAAAG 4920
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GGCGCGTTGT GTGAACCACC ACGTGTATCT GTAATTTCTG ACCCAGGCGT TTGAATATGT 5040 

TTATCTTGTG CATGATACAT AAACATTGTA CCTTTAGGCA TACGATGCGA AATAACTGCT 5100 

CTTGCCGTTA CAACACCATT ACGGTTATAC ACTTCTAGCC AATCATTATC TTGGATATCG 5160 

TGTTTTTCAG CATCTTCATT TGATATCCAA ACCGTTGGAC GACCTCTAAA TAGTGTCAAC 5220 

ATATGCTTAT TATCTTGATA CATTGAGTGT ATATTCCATT TTCCATGAGG CGTTAAATAA 5280

CGCAgTACCA AAGCATCTGT ACCACCTTTA ATTTTCTTAT CTCTATTCCC AAATACCATT 5340 

GGCGGCAATG TCGGTTTATA TACTGGTAAG CTCTCCCCAA ATTGTTGGAA AACTTCGTGA 5400 

TCCACATAAT AACTTTGACG TCCTGTTAAT GTTCTAAAAG GTACTAGACG TTCTATATTC 5460 

GTTGTAAATG GTGAATATCG TCGACCTTGT TTATTTGAAC CTGGGAATAC TGCTGTCGGT 5520 

ATTACTTCTC GTGGTTGTGA AGTTATATTT AAAAACGAAA TTTTCTCAGC AGCGCGTTCG 5580 

CTAGAAATAT CTTTTAACGG CATTCCAGTT TGTTCTTCGA GATCTTCATA TGATTTTTGT 5640

GATAATTTAC CATTCGTAGC AGATGAAATA CTTAGTATTG CATCAGCTAC ATTACGTGCT 5700 

GTATCAATAC GTGGACGATT CGCTCTCACA GAATCATCAT TTGTATCACT CCACGTACCT 5760 

AACATACTTT TTAATTCTTC ATATTGTTCA CTGACACCGA AACTTACACC ATGTGCTCCA 5820

ACTTTCCCTT TTTCAAGTAC AGGACCAAGC GTGACATATT TGTCGTAAAT TTTAGTGTAG 5880 
TCGCGTTCTA CAATTGCAAA GTTAGGCATT GTACGTCCAG GTACCGCTTC AATTTCACCC 5940

TTCGACCAAT CTTTCACTAC GCCGTATGGT GTTGAAATTT CTTGCTTTGT ATCATGACTA 6000

AGTGGAGTTG TCACAACATC TTTAAACGTT CCAGGTAAAT AGTCTTTTGC CATTTCTGAA 6060

AATGCTTTTG CCAACGTTTT ATAAATATCC CAGTCTGAAC GCGATTCCCA TAACGGATCA 6120 

ATGGCAGGAT TGAAAGGATG TACATATGGA TGCATATCCG TTGATGATAA ATCATGTTTT 6180 

TCATACCAAG TCGCTGCCGG CAAAACAATG TCAGAATATA ACGGTGTTGC CGTCATTCTG 6240 

AAGTCTAAAG AGACCACTAA ATCTAACTTA CCTGTTGTTT CTTCACGCCA CGTAATTTCT 6300 

TCTGGCTTTT CATCTTCATT TGGTGTAGCT AATAACCCTG ATTTTGTGCC AAGTAAATGC 6360 

TTCATAAAGT ATTCTTGACC TTTTGCAGAA CTTGAAATTA AGTTTGAACG CCATATAAAT 6420 

AATGATTTTG GATGATTCTT TTTCAAATCA GGATCTTCTA TTGCAAATTG TGTTTGTTTT 6480

GATTTCACTT CATCAATTGC ACGTTGCAAA ATCGCTTCAT TTGAATCTAT ACCTTCATCT 6540 

TTAGCTTCTT CTGCAAACAA CAAACTATTT TTATTAAATT GTGGATATGA TGGTAACCAA 6600 

CCAAGTCTAG CTGCTAAAAC ATTATAATCA GCTGGATGTT GATGCTTTAA CTCCTCTGTT 6660

TTAGCTAATG GAGATTTTAA ACGATCTACA TTTGACTCTT CATATTTCCA TTGGTCTGTT 6720 
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AATGCGACAG TACTCCATCC TTCAATCGGA CGACATTTTT CTTGTCCCAC ATAGTGAGCC 6840 

CAACCGCCAC CATTCACACC TTGACAGCCA CATAACATAA CTAAGTTTAA GATTGAACGA 6900 

TAAATCGTAT CTGAGTTAAA CCAATGGTTA ATACCCGCAC CCATGATAAT CATTGAACGC 6960 

CCTTCAGTAT CGATAGCGTT TTGCGCAAAT TCTTTCGCTA CTTGAATGAC AACACTTTGT 7020 

TTTACGCCTG AAATGGCTTC TTGCCAAGCA GGTGTATATT TTGATTCTGC ATCGTCGTAT 7080 

CCTTTTGATT CTAATTTATG ATCAAAACGA CGCACGCCAT ATTGACTTGC CATTAAGTCA 7140 

AAAATTGTAG CAATACGGAC TTTGTCACCA TTTGCTAAAG TGACTTGTCG AGTTGGAATT 7200 

GGACGATTGA ATATCCCATC TCCATCACTA TCAAAGTATG GGAATTGAAT TGTTTCTAAT 7260 

TCGTATCCAC CTTCTGTCAT TGATAATGTA GGGTTAATTT TAGAACCATC TTCTGTTTCT 7320 

AGTTTTAAGT TCCACTTCTT ACCTTCTTCC CAACGTTGAC CCATTGTGCC ATTAGGTACT 7380

ACTAAACTAT CGCTGATTGC ATCATGAATA ACTGGCTTCC ATTCGCCTTG CTCTGTTGTT 7440

TGACCTAAGT CACTCGCTCT TAAAAATCGA CCCGCTTTAT ATCCATTTTC ATCTTCATCC 7500 

AGCATGATAA GAAACGGCAT ATCTGTATAT TGTTTAGCGT AATTTATAAA GCGTTCATTA 7560 

GGTTGATTAA CATAATGTTC TTGTAAAATA ACATGCGTCA TTGCTTGTGC AATTGCAGCA 7620 

TCTGAACCAG GATTCGGTGC TAGCCAGTTA TCTGCAAATT TCACATTTTC TGCGTAATCT 7680 

GGTGCTACTG AAATGACTTT TGTACCTTTA TAGCGGACTT CAGTCATAAA ATGTGCATCC 7740 

GGAGTACGTG TTAAAGGTAC ATTAGAGCCC CACATAATAA TGTATGATGC GTTATACCAG 7800 

TCACTTGATT CAGGCACATC TGTTTGCTCT CCCCAAATTT GTGGAGAGGC AGGTGGTAAA 7860 

TCTGCATACC AGTCATAAAA ACTAAGCATT TCACCACCAA GCAAATTGAT GAATCGAGCA 7920 

CCTGCTGCAT AACTAATCAT TGACATCGCT GGAATAGGTG TAAATCCTGC GATTCGATCT 7980 

GGACCATATT TTTTTATTGT ATACAGTAAT TGTGCTGCGA TTATCTCTGT AACGTCTTTC 8040 

CAATTTGAAC GCACGTGCCC TCCCATACCT CGGGCTTGCT TATATTGTTT GGCTTTGTCT 8100

TCATTTTCAA CAATAGACGC CCATGCAGCA ACGCGATTAC CATTGTTTTC TTCTAATGCT 8160 

TCAGTCCATA AATCCCAGAG TTTTCCACGA ATATATGGAT ATTTGATTCG AAGCGGACTG 8220 

TATTCATACC AAGAGAATGA CGCACCTCGT GGACATCCTC TCGGTTCATA TTCAGGCATA 8280

TCCGGACCAC AACTTGGATA GTCAGTTTGT TGATTTTCCC AGGTAATCAC ACCATTTTTC 8340 

ACAAATACTT TCCAAGAACA TGAGCCTGTA CAGTTAACAC CATGTGTTGT TCTTACTTCT 8400 


TTATCGTGGC TCCAACGTTC TCTGTACATT TTTTCCCATT CTCTACTTTT ACTTTCTAGG 8460 

ATCGACCAAT TCCCATTAAA TTTTTCTGTT GGCTTAAAGA AATTCAATCC AAATTTTCCC 8520 
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10 






TAAAATGCCC AAGACTATTG CTTTAATTAG ATTGTACATT TTTTCACAAA CATAAAATAT 8640

TAGGGAATCA CCTAATTACT TAAGGAATTT CCCTATCAAT AACGGGATTT CATTGAAATA 8700

ATACACAATC ATGTATGGTC ATGCTTATTG CCAATCTAAA TCGTTCAAAT TTGGCACAAC 8760

GACAAATAAG GCTTCAACAC GAATATATTC TCTCGGTTGA AACCTTACTT ATTCATTTAT 8820

TTTTTATAAA TTAGTGACAT AACACTGTAT TAGCATCTGC ACGATCGGTT GAAATATATG 8880

TTACATTTTC TTGCTGCTTA ATAAATGCAT CATAGTAATC ATATTGCGAC GAATGATATG 8940

TGCCATTCGA TGTATCATTT GGGTTTAGCA AACAGCCATA ACCTTCGTCA TATAAATGTT 9000

CACAGAGCAT AAGGGCGTCA TGTTTAGAAC CACTTACTAC ATAAAATTGC TTCATAGGAT 9060

CATATGATTT AGGAGTGTTT TCAGTATAAT CAACAACTTC CCCTATAATA CATATACCTG 9120

GTTTCGCCTC AATTGAATAG TGTTGCAATT TTGAAATAAT ATTACTTAAA CGCCCCTTAA 9180

CAACAAACTC GTTAAAACAC GATGCTTGAA AGACAATCGC TATCGGGTAA TGAATATCTG 9240

TGTATTGTTG TATCTGTGTG ATAATTTTCC CTAAACGTTT TACCCCCATA TAAATTGCTA 9300

ACGTGCCACC ATTCACTAAG GAATTGACAT CCACTTCATT TTCTTCTGAA TCTTTAAAGT 9360

GACCTGTAGA AAATGTCACA CTTTTAGCAA CTGTACGCAT TGTCAAACCT GTCTGCATAG 9420

TAGCAACTGc tGCGCTCGCT GATGTCACCC CTGGTACAAT TTCAAACGCA ATATGATGTT 9480

CATTTAGTAT GTCGACTTCT TCTTGCACAC GACCAAATAT CGCTGGATCG CCACCTTTAA 9540

GTCTAACAAC CTTGTTATAT CGACGCGCTG CTTCCACGAT ACAGTCATTT ATTTTTTCTT 9600

GCTGAATATG TTTTGCATAC GGCTTTTTAC CAACATCGAT AATTTCAGTA GTCAAATTCG 9660

CATATTGTAA AATTAACGGA TTCACTAATC GATCATATAG AATGACATCC gCTTCACGTA 9720

TTAAACGCTC AGCCTTTTTC GTCAAATAAT TCGGATTACC TGGACCCGCA CCTATCAAGT 9780

AAACCTTGCC ATATTCCTCT ACAGACATAT ATATACGTTC CCGTCTGTAA CTTCTACCTC 9840

ATAAACATCT ACACAACCTT CATCAGGTTC TTGAACAATA CCTGTATTTA AATCAATTTT 9900

TTGATCGTGG AGCGGGCAAA ATACATATTC CCCACTCACT GTCCCTTCAG ACAATGGTCC 9960

TTGTTTGTGT GGACAGATAT TGTGAATCGC ATGAATTTTG CCACTTTCTG TTAAAAACAA 10020

CCCTACCTCT TTGCCTTTGA CAATAACCTT TTTTCCAATT AGGGGTGTTA ATTCATCTAT 10080

AGTTGTCACT TTAATTTTTT CTTTTGTTTC CATGTATTAC ACCTTCTCCA CTTCAAAAAT 10140

TCTACGTGCT TGAGCATTGC TAGTTATTGC TTCCCAAGGT TCAGCTTCGA CTGCTTTTTT 10200

AGCATCCATA ATGCGTTCAA ATAGTTCATT TTGTCTTTCT GGGTCAAGTA AGACTTCTTT 10260

TACATTTTCA AATCCAAGTC TTCTTAACCA TGGCGCTGTT CTTTCAGCAT ATATACCTGT 10320
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10 






AGTTGTTAAA AATTCAGCTT TTTCAACTTC TGTACCACCA TTACCACCGA TATAGATTTG 10440 

GAATCCATTT TCAACTGAGA TAATACCAAA ATCTTTAACA CCTGATTCAA CACAACTTCT 10500 

TGGGCAGCCT GATACACCCA TTTTGAATTT ATGAGGTGTA TCGATGTATT CAAATGTTTT 10560 

TTCTAAACGA ATGCCAAGTC GTGTCGTGTA TTGCGTACCA AATCGACAAA ACTCTTTACC 10620 

AACACAGCTT TTAACTGAGC GTGTTTTCTT ACCATAAGCT GATGcTGAAC GCATACCTAG 10680 

GTCTTCCCAT ATATTTGGTA ATTCTTCTTT TTTAACTCCA TACAAACCAA CACGTTGTGA 10740 

ACCTGTCACT TTAACTAGTG GCACATGATA TTTCTTAGCC ACTTCTCCTA GACGAATCAG 10800

TTGGTCTGCA TCTGTAACAC CCCCACGCAT TTGAGGTATA ACAGAAAATG TACCATCATT 10860 

TTGAATATTC GCATGGTAAC GTTCGTTAGC AAATCTTGAT TCTCTTTCAT CTTCATGATC 10920 

ATGTGGATAA ACCATGTTTA AATAATAGTT GATTGCTGGT CGACATTTTG GACATCCACC 10980 

TTTATTTTTA AAGTTTAAAA CATGTCGAAC TTCTTTAGAT GTTTTTAAAC CTTTCGCTCT 11040 

TATTTGCGTT ACTATTTGAT CGCGTGTCAA ATCAGTACAA CCACATATAC CAGCAGGTTT 11100 

TGCGGCAACA AAGTCATCTC CTAAGGTGTG CTGCAATATT TGAGCAATTT GCGGTTTACA 11160 

TTTACCACAT GAATTCCCCG CTTTTGTTTT AGCCGTTACT TCTTCAACTG TTGTAAAGCC 11220 

ATTTTCCGTA ATCGCATTTA CTATAGTACC TTTATCAACA CCATTACAAC CACAAATTGT 11280 

TTCATCATCA GCCATATCAG CAATTGATAG CGATGCCTCT TCTCCACCTT TAGTAAGCAA 11340

TGATACAAGT GTGTAATCTT CAGTGGATTC ACCTTTTTTC ATCATGTTAT AAAAGCGTGA 11400 

ACCATCATCG ATATCACCAT ATAGTACTGC ACCAACTACA TTACCGTCTT TTAAAAAGAT 11460

TTTTTTATAG TTATTATCAA CACTATTAAA TATTTCAATA CCTTTAATTT CTGCATTTTC 11520

TACAATTTGA CCAGCACTAT ACAAGTCACA CCCAGAAACT TTTAATGACG TAAATGTTGT 11580 

TGATCCCTTG TATCCGTTCG TTTCTTTATT TGTTAAATGA TCAGCTAATA CTTTACCTTG 11640 

TTCATATAGT GGTGCAACGA GTCCATAAAC TTTGCCGTTA TGTTCTGCAC ATTCACCAAC 11700 

TGCATATACA TTGCTATCAC TTGTTTGCAT CACATCATTG ACAACAATAC CACGATTAAC 11760 

ATCTAGACCT GATTCTTTGG CTACTTCTGT GTATGGTCGT ATACCTACTG CCATAACAAC 11820 

TAAGTCTGCC GGAATCTCGC GTCCATCAGC CAATTTAACA CCCTCAACAT CATCTTCTCC 11880 

TAAGATTTCA GTTGTGTTGG CTTGCATTTC AAACTTCATA CCTTGCTTTT CTAGATCTGC 11940 

TTTAAGCATA TTTCCAGCTT TACGGTCTAG TTGCATTTCC ATCAACCATT CAGCTAAATG 12000 

TAACACCGTT ACTTCCATAC CTTGATCTAA TAAACCACGT GCACACTCTA AACCTAGTAA 12060 

TCCTCCACCA ATTACAATTG CTTTCTTTTT AGTCTTAGCA ATGTTCATCA TTTGTTCAGT 12120 
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GAATGCTTTA GAACCTGTCG CAAAAATCAA TTTATCGTAT GATACTTCAA TACCATTTGC 12240

AGTAGTAACT GATTGATTTG CTCTATCTAC TTCAATTACA GGATCATTTG TAATTAACTC 12300

GATACCATGT TCCTCATACC ACTCATATGG ATTCATAATT GTTTCTTCAA CTGTCATTTT 12360

ATTTTGTAAA ATATTTGAAA GCATGATGCG GTTATAGTTT GGATAAGGTT CTTTACCTAT 12420

TACCGTAATA TCATATAAAT CGTTGGCGCG CTCTAATATT TCTTCGATTG TTCGAATGCC 12480

CGCCATACCG TTACCAATCA TTACTAGTTT TTGCTTTGCC ATAAAATATG CCCCTTTACT 12540

CCATAATATT TATTTCAAAA AAAGGTATTA ATTTTTCGTT AGTGCTTTTA TATTTTCATT 12600

GGAATCATTA AGCTTTCTAA TCTATCGTTA ATGATTTGCT TTAAAATTGG GTCGAAGTTA 12660

ATTGAAGGTG TGAAGTGTAT ATCTGTATTA ATAACCATGT CATTCATTTG CTGCTTCACT 12720

TTGTTAACAA GTCTTCCGTC ATATAAAAAT AATGGTACGA CAATCAATTT TTGATACCGT 12780

TTCGAGATGC TTTCTAAATC ATGTGTAAAA CTAATCTCTC CATATAGCGT TCTCGCATAT 12840

GTCGGCTTGC TAATTTGCAA ATTTTGAGCG CATATTTGTA ACTCTTCGTG TGCCTTAGTA 12900

AACTTTCCAT TAATATTGCC GTGTGCAACA ACCATAACTC CAACTTGTTG TTCGTCACCT 12960

GCTAATGCGT CACAAATACG TTGTTCAATT AATCGTCTCA TTAAAGGATG TGTGCCAAGT 13020

GGCTCGCTTA CTTCTACCTT TATGTCTGGA TACCGTCGTT TCATTTCATG AACGATATTC 13080

GGTATATCCT TGAGATAATG CATTGCACTA AAGATTAGCA ATGGTACAAT TTTAAAATGG 13140

TCAACCCCAC TTTGAATCaA CGTCGTCaTT ACCGTCTCTA AATCCtGATG CTCACTTTCt 13200

AAAAACGCAA TATCATAGTG ATGTATATCA TCTTTTACTA ATTCAGAAAT AAATGCTTCT 13260

AACGCTTGaT TCTGTCGTCC GTGCCTCATG CCATGTGCAA CAATGATATT CCCATTCACA 13320

TTTACCAACC CTTTCACACG TATTGTATAC CAAATCATTT TGTTTTTGTG AAAAGAATCA 13380

CATTATAATG TAAAATCAGG GAATTCCCTG ATGCCTGTAG TCATGCATAT TCCTTATACA 13440

TTTTCCCTTT TTGTTAAATC AAAAAAAGCG ACCGATATAT GAATCCCTAC TCAACATTTA 13500

TTTGAGCAAG CATTAATATA TCGGTCGCTT GTAGTGTATA TTATTATCTT AAAATGGTGG 13560

TTGGCCTAAT ATTGTTTCGT CAAAGCGCTC GGGTATCAAT ACTTTGCGCA TGATCACACC 13620

TAAATCGCCA TCATCATTTT CATGTTCGCT GTATATTTCA TAACCTCTTT TTTCATAAAT 13680

TTTAAGTAAC CACGGATGCA ATCTTGCAGA TGTACCTAAA GTAACTGCCG CTGACTTTAA 13740

CGTATCTCGC AAAAATGCTT CTTCAACATA AGTAAGTAAT TGGCTACCAT AGCCTTTCCC 13800

TTCATACTCA GGATTTGTCG CAAACCACCA GACAAAAGGA TAACCCGAAA TACTTTTCAC 13860

ACTTCCCCAA GGATATCTAA CCGTAATCGT AGATATAATT TCATCATCAA TTGTCATGAC 13920
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CCAATCAATA CCTAGTTCTC TTAGAgGOGT AAATGCTTCA TGCATGAGTT CTTGCAATTT 14040
TTCTGCATCT T 14051 
(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1885 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

TAATCCTCAA CTTnGATTAT ATGGCTTGGG CGCATATGAA CTGCTTAGTT TAGTGTATGA 60

CATTCATACA GTTCGCATGA CTATCATACA ACCTCGAATA GATAACTTTT CTACTGAAGA 120

GTTACCAATC TCAAGATTAC TTCAATGGGG AACCGATTTT GTTAAACCCT TAGCCAGACT 180

TGCTTATAAC GGTGAAGGTG AGTTTAAAGC AGGTAGTCAT TGTAGATTCT GTAAGATAAA 240

GCATTCATGT AGAACACGTG CAGAATACAT GCAAAATGTG CCTCAAAAGC CACCACATTT 300

GTTGAGTGAT GAAGAGATTG CAGAACTTTT ATATAAACTG CCTGATATCA AAAAATGGGC 360

TGATGAAGTA GAGAAATATG CGTTAGAACA AGCGAAAGAG AATGATAAAA CGTATCCAGG 420

TTGGAAGCTA GTCACGGGAC GTTCAAGGAG AGTGATAACT GATACAAAAG CAGTCCGAGA 480

CAGGTTAGTT GAAGCGGGTT ATAAACCTGA AGATATTACA GAAACCAAGT TACTTAGCAT 540

TACGAATTTA GAAAAATTAA TCGGCAAAAA AGCATTTTCT AAAATTGCAG AAGGCTTTAT 600

AGAAAAGCCG CAAGGTAAAT TAACACTTGC TACCGAGTCT GATAAACGAC CAGCTATAAA 660

GCAATCTGCT GAAGATGATT TTGACAAACT ATAAAAATTA AAAAGGACGG TATATAAACA 720

TGA^AGCAAA AGTATTAAAT AAAACTAAAG TGATTACAGG AAAAGTAAGA GCATCATATG 780

CACaTATTTT TGaACCTCAC AGTATGCAAG AAGGGCAAGA AGCAAAGTAT TCAATCAGTT 840

TAATCATTCC TaAATCAGAT ACAAGTACGA TAAAAGCCAT TGAACAAGCT ATAGAAGCTG 900

CTAAAGAAGA AGGAAAAGTT AGTAAGTTTG GAGGCAAAGT TCCTGCAAAT CTGAAACTTC 960

CATTACGTGA TGGAGATACT GAAAGAGAAG ATGATGTGAA TTATCAAGAC GCTTATTTTA 1020

TTAACGCATC AAGCAAACAA GCACCTGGTA TTATTGACCA AAACAAAATT AGATTAACGG 1080

ATTCTGGAAC TATTGTAAGT GGTGACTATA TTAGAGCTTC AATCAATTTA TTTCCATTCA 1140

ACACAAATGG TAATAAGGGT ATCGCAGTTG GATTGAACAA CATTCAACTT GTAGAAAAAG 1200

GCGAACCTCT TGGCGGTGCA AGTGCAGCAG AAGATGATTT TGATGAATTA GACACTGATG 1260
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TTGAGGTGTC AAGAATTTGA AATTTATGAA TATAGATATT GAAACATACA GCAGTAACGA 1380 

TATTTCGAAA TGTGGTGCCT ATAAATACAC AGAAGCTGAA GATTTCGAAA TTTTAATTAT 1440 

AGCTTATTCG ATAGATGGTG GAGCGATTAG TGCGATTGAC ATGACTAAAG TAGATAATGA 1500

GCCTTTCCAC GCTGATTATG AGACGTTTAA AATTGCTCTA TTTGACCCTG CTGTAAAAAA 1560 

GTATGCATTC AATGCTAATT TCGAAAGAAC TTGTCTTGCT AAACATTTTA ATAAACAGAT 1620 


GCCACCTGAA GAATGGATTT GCACAATGGT TAATTCAATG CGTATTGGCT TACCTGCTTC 1680 

GCTTGATAAA GTTGGAGAAG TTTTAAGACT ACAAAGCCAA AAAGATAAAG CAGGTAAAAA 1740 

TTTAATTCGT TATTTCTCTA TACCTTGTAA ACCAACAAAA GTTAATGGAG GAAGAACrAG 1800 


AAACCTACCT GAACATGATC TTGAAAAAtG GCAACAATTT ATAGATTaCT GTATTCGAGA 1860 

TGTAGAAGTA GAAATGGCGA TTGCT 1885 
(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2656 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

TAATCCTTAG TTCACTGnCA AATTTCAAAA CACCAGTTCC CTCTATCTGC ATCCATAGAA 60

ACTGnATGTT TGTGTCAATA ACCGGATTAT ATTGTGATGn TGTTTGTAAC TCGATTAAGT 120 

TATCATCTTT CGAAAAATTA TCTACTACCA TTATTCAACC ACCTTTCCTT CGAATAAACT 180 

CCATTTACCA ACkCCACCAG TACCAAAGTT TCTAACTAAA AATTGATGTG CAGACGGGAA 240

GTTATTACGT CTTAATACTT GTGTTGTATT ACCTGGTGTA TTCGATTTTA CTTCTAATAT 300 

CCAACCTGCA ATACCTTTAA AGTCTTTAGG AAAATCAGTA AATCGGTTTG ATTCTTCAGT 360 


AGTGATATAG AAATCTAAAC CAACGATTTT TAAATCTGAT AATTTTGTAA TACTCTTAGG 420 

GATATGTTCC CAATAACCGG CGTTTTGCGG GCAGAAATTC CATGCTCCGT TGTTTTTCTT 480

ATTGAAAATG TCAATGACAC GTTCGAATTT AAGCATATTT CTACCTGTGC TGTTTCTGGt 540

AAGTACTTGT CTTAGAGCAC CATTATAGTG TCCAGGCAGT ACATCCAAGA ACCACCCTGC 600 

ATCTCTAAAC GCTTTCGGTA ACGGGAAATC TAATGCATTT TGTGTGTCTT GaCGTATAGA 660 

TATAGTAATG ACCAACTTCC GTAATATCAC TTAGATATGC TGGGTTCTGT ATTGGTAACG 720

GTTTAACACG TCCGCCTGAA TCAGTCATTG ATACTTGAGG TGCGATGTTT TTCAAGAATT 780 

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TAGTTACCCC GATTAGAAGT GCTTTACGTC CTGTTTCTAG ATCGTAATAC ATATCTAGAC 900

CCTCAGCCTC TTGGAAATCT CCTTTAAAGT TGTTATTCAC ACCGCCTATA TCGATGCGAC 960

GTTTAAATAA CAATTCTTTC GTTTTGATAT CGAAGCCTTG TAAGTAGTTA GGGTTGGCTG 1020

TATTCGAATC ACCTGTATAC CAATATAAGA TACCTGCATC ATAAGTGATA CCTTGCATAG 1080

GTTGTGTATC TGAAGTGTAT TCCATAGGTA TATCCATTTG ATACAATACT TTGTCTATAC 1140

CTTTATCAAT ATCGTCAGCA CTTCTAACCT CAACAAAGTT CAACGAATTC TTAAGTTGTC 1200

TTTCAGTGGG TTTATATTCA CGTCTAAAAA TCATTAAATT TTCTACCGGA TTATAAATCG 1260

CTGACGTATA TCTGTCGTTA AATATATTCG GCATGACATC TTGCATTTCA TTACCATAAG 1320

TTATTTCTCC AGTTCTATAT TGGAAACGTA CAAACTTGTT GTTTTTGTTA CTGTCCAATA 1380

CAGCTGAATA AATCCATAAT TCTCCATCAA TGTATCTATA CGCATTGTGT GTACCGTGAC 1440

CGCCGTTTTT AACAAGCAAT CTATCAATAA ATTGTCCGTT GGGCTTCAAT CTAGATAACA 1500

TGTAATGATT ACCTGGACGA GCTTGCGTCA TATAAATAAT TTTCGTTCTA GGGTCTACCC 1560

AAAATGATTG CATTACTGCA TTTGTATATG GCGATAAATC AGTGATAAAT TCCGGTTCTT 1620

GCTCTTTTGG TTCGAATCGG TATTCTGTCG CTCGATATTC TTTATAGTGT TCATCTACAG 1680

CTTTCTCAAC CTTT*1 M 1AGTG AAAACATCTA GTGTTGAATA ATCATGATAC AAACGATCTT 1740

GCAATGTCTT ATGACCATAA CCTGTATTAT CAACGCGCGC GTCTTTTAcT TCGTTGATAC 1800

CGTCGCCGTT ATGACCTAGT ACCATGTTGC TAAATCGACC GTTTAAATAT GTTAAAAAGT 1860

CAGAGACGTT ACTTGTAACA TTTAAATGTT CATACTTTAT TTGTTCTCCA TCATGTGCGA 1920

ATACCTCTTT ATTTCTGTGG TATTCAAGAG AGAAATTAAA ATCCGTCAGC ATGTCTGAAA 1980

TAAGTTTAAA GTTATACTCA TTTTCATCTA CATATCTGTA GTCAAAGACT CTACTTAAAT 2040

CTGTAATTAG TTTATTACTC ATGTTTTCCT CCTTTACTAT CCATAAAACT GATmATAATT 2100

TTTAATAAGC TCATACATAA TAACTTCATG ACCTCTTTCA TTAGGATGTA ATCCATCAGG 2160

CATGCTAGAT TTTCTAAATG CTGGATTATA TGGTTTGAAA TAATCTGTGT GATAAGCATC 2220

ATATACTGGT ACATCCAATT CACTACAAGC CAATATCTGA GCATTGACAT AATCCTCTAA 2280

AGTTAACCCT AGTTTGTTTT TGTCCGTATC TTTACGGCGT ATCGTTGTAC CACTCATAGG 2340

GCATTGCCTA GTAGCTGTCA TTACAAGTAT TTTTGAAGCT GGATTATTTT TCCTGATAAC 2400

TTCAATTGCA GAACAAAAGG CGCCGTAAAA CGTTTTAGTG TCGGTTTTAT CAGTGCCTAT 2460

CGGTACGCCT GCCCAATAAC CATGTAACCA GTCATCATCT GTACCTTGTA ATATGATTAG 2520

GTCTCCTCTT ATTTGCTCTG CTTGTCTaTA AATGCTGTTT TCTaCCGCTT CTTTACCTAT 2580
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CTTGCCTAAC ATTTCT 2656 
(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4854 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

AAAATGAGGG TTCTAGCGGA AATTACCAAA AGCGTGGTTC ATACTATGGG CAGCGTAATC 60
GTATTTCAAA AGAAAAAACA CCTAAATGGT TAGaAAATAG AGATAAACCT AGTGAAGAAG 120
ATTCGGCTAA AGATAATAGC GTAGATGATC AACAATTAGA GCAAGATCGA CAAGCATTTC 180

TAGATAAATT ATCTAAAAAA TGGGAGGAGG ACAGTCAATA ATGAAGCAAT TTAAAAGTAT 240
AATTAACACG TCGCAGGACT TTGAAAAAAG AATAGAAAAG ATAAAnCAGA AGTAATCAAT 300 
GACCCAGATG TTAAGCAATT TTTGGAAGCG CATCGAGCTG AATTmACGAA TGCTATGATT 360 

GATGAAGACT TAAATGTGTT ACAAGAGTAT AAAGATCAAC AAAAACATTA TGACGGTCAT 420

AAATTTGCTG ATTGTCCAAA TTTCGTAAAG GGGCATGTGC CTGAGTTATA TGTTGATAAT 480

AACCGAATTA AAATACGCTA TTTACAATGC CCATGTAAAA TCAAGTACGA CGAAGAACGC 540 


TTTGAAGCTG AGCTAATTAC ATCTCATCAT ATGCAACGAG ATACTTTAAA TGCCAAATTG 600 

AAAGATATTT ATATGAATCA TCGAGACCGT CTTGATGTAG CTATGGCAGC AGATGATATT 660 

TGTACAGCAA TAACTAATGG GGAACAAGTG AAAGGCCTTT ACCTTTATGG TCCATTTGGG 720 


ACAGGTAAAT CTTTTATTCT AGGTGCAATT GCGAATCAGC TCAAATCTAA GAAGGTACGT 780 

TCGACAATTA TTTATTTACC GGAATTTATT AGAACATTAA AAGGTGGCTT TAAAGATGGT 840 

TCTTTTGAAA AGAAATTACA TCGCGTAAGA GAAGCAAACA TTTTAATGCT TGATGATATT 900

GGGGCTGAAG AAGTGACTCC ATGGGTGAGA GATGAGGTAA TTGGACCTTT GCTACATTAT 960 

CGAATGGTTC ATGAATTACC AACATTCTTT AGTTCTAATT TTGACTATAG TGAATTGGAA 1020 

CATCATTTAG CGATGACTCG TGATGGTGAA GAGAAGACTA AAGCAGCACG TATTATTGAA 1080

CGTGTCAAAT CTTTGTCAAC ACCATACTTT TTATCAGGAG AAAATTTCAG AAACAATTGA 1140 

ATTTTAAAAT GATTGGTGTA TAATGAATAC AAATCTAAAT CGTTTAAATG ATTGAAGACA 1200 


AGATGATCTA ATCAATATTA CACAGAAAGC CATTGTTTGA TGAGAATATG GTTAATAAAT 1260 

TAGATGATTA CTACTTCATT TATGGTATTT GTAATGAATA CCCGGATCAA GACCGTTATC 1320 
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10 






CTCGTCCCTT GTATAGGGGC GGGATTTTTT GTTTTTTTCA GACATAAATG TTTGTTGGTG 1440

TCATAAATTC CCTGTTTATT GTTAATAGGT TTAATGTTAA AACGATGATT GTTGTTCAAT 1500 

TTTTTAACGA GGTCAGATAA AAGTATTTAT AAAGCAAATA GGAGGGTTTA ACATGGAACA 1560 

AATTAATATT CAATTTCCAG ATGGTAATAA AAAGGCGTTT GATAAAGGTA CTACTACTGA 1620 

AGATATAGCA CAATCAATTA GTCCTGGATT ACGTAAAAAA GCTGTTGCCG GCAAATTTAA 1680 

CGGGCAACTT GTAGATTTAA CTAAACCGCT TGAAACTGAT GGATCAATTG AAATTGTGAC 1740 

ACCAGGTAGT GAAGAagcGT TAGAGGTATT ACGTCATTCT ACTGCACATT TAATGGCACA 1800 

CGCGATTAAA AGGTTATATG GTAATGTTAA ATTTGGTGTA GGTCCTGTAA TAGAAGGTGG 1860 

ATTCTACTAT GACTTCGACA TTGACCAAAA CATCTCATCT GATGACTTTG AACAAATTGA 1920 

AAAAACAATG AAACAAATCG TTAACGAAAA TATGAAAATC GAACGAAAAG TGGTTTCACG 1980 

AGATGAAGTG AAAGAGTTAT TCAGCAATGA TGAATACAAA TTAGAATTAA TCGACGCGAT 2040

TCCTGAAGAT GAAAATGTAA CATTATATAG TCAAGGTGAT TTTACTGATT TATGTCGTGG 2100 

AGTTCACGTT CCATCAACAG CTAAAATTAA AGAGTTTAAA CTATTATCTA CAGCAGGTGC 2160 

ATACTGGCGT GGAGATAGTA ACAACAAAAT GTTACAACGT ATATACGGTA CTGCTTTCTT 2220

TGATAAAAAA GAATTGAAAG CACATTTACA AATGTTAGAA GAGCGTAAAG AACGTGATCA 2280 

TCGTAAAATT GGTAAAGAGT TAGAACTATT CACAAATAGC CAATTAGTTG GTGCTGGTTT 2340 

GCCATTATGG TTACCTAACG GTGCAACAAT TAGACGTGAA ATTGAACGTT ACATTGTTGA 2400

TAAAGAAGTT AGCATGGGAT ATGACCACGT TTATACACCA GTACTTGCTA ATGTTGATTT 2460 

ATACAAAACA TCTGGTCACT GGGATCACTA TCAAGAAGAT ATGTTCCCAC CAATGCAGTT 2520 

AGATGAAACT GAATCTATGG TATTACGTCC AATGAACTGT CCACATCATA TGATGATTTA 2580 

TGCGAATAAA CCACATTCAT ATCGTGAATT ACCTATCCGT ATCGCTGAGC TAGGAACGAT 2640

GCATAGATAT GAAGCAAGTG GTGCTGTATC AGGATTACAA CGTGTTCGTG GTATGACTTT 2700 

AAATGATTCA CATATCTTTG TTCGACCTGA TCAAATTAAA GAAGAATTCA AACGCGTTGT 2760 

AAACATGATT ATTGATGTGT ATAAAGACTT TGGTTTCGAG GATTATAGCT TTAGATTAAG 2820 

TTATAGAGAC CCTGAAGATA AAGAAAAGTA CTTTGATGAT GATGATATGT GGAATAAAGC 2880

TGAAAATATG CTTAAAGAGG CAGCGGATGA GCTTGGCTTA TCGTACGAnG AAgCGATTGG 2940

TGAAgCGGCA TTCTATGGTC CGAAACTAGA TGTTCAAGTT AAAACAGCGA TGGGTAAAGA 3000 

AGAGACATTA TCAACAGCAC AACTTGATTT CTTATTACCA GAACGTTTTG ATTTAACTTA 3060

TATTGGTCAA GATGGTGAAC ATCATCGTCC AGTTGTTATT CATCGTGGTG TTGTATCAAC 3120 

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AGCGCCAAAA CAAGTTCAAA TCATTCCAGT TAACGTTGAT TTACATTATG ATTATGCGCG 3240

CCAATTACAA GATGAATTGA AATCTCAAGG CGTTCGTGTA AGTATTGATG ACCGTAATGA 3300 

AAAAATGGGT TATAAAATCA GAGAAGCTCA AATGCAAAAA ATACCTTATC AAATCGTAGT 3360

TGGGGATAAG GAAGTTGAAA ATAATCAAGT GAATGTGCGT CAATATGGAT CGCAAGACCA 3420 

AGAAACAGTT GAAAAAGATG AATTTATCTG GAATCTAGTT GATGAAATTC GTTTGAAAAA 3480 


ACATAGATAG ACAGTTGTCG CAATAAAATG CTTTAAAACT TTTATTGCGT ATCAAGTTTT 3540 

ACAGGGTTGA TTATGCGTGA TGAATCCTGT ATATTACAAG TTAGTTAAAA TATTAAATTG 3600 

AGTTAGAGGT TGCATGTTTA ATTAGTAACT TGTCAGAAGT ATTTATGGTA CATAAGTTGA 3660

ACAAGTGAAA GGTAAAGATG CCGAAATAGA TATAAACCAT AAATTATATC TATTGGGACA 3720 

GTTTTCGAAT AGGAACTGTA CTGTCACAGA ATGTGATGTG CTACCTTATA TAGATAATTG 3780 

CCAAAGTGGT TGCATATCTT AAAGGTATGT AGCCACTTTT TTACTTTTAA TATCACTATG 3840

TTCTGTAAAA AAGGGTATGA AAGTGAATAA AGGTTATTTA TTTCTTGGCC TCTAAAACAT 3900 

GGAAAGGGAG CTTATATGTC AAAAGTTCAA AATGAAAGTA ACAATGTTGT CAAAAGGGGA 3960 

CTTAAAGATC GTCATATTTC TATGATTGCG ATTGGGGGTT GTATTGGTAC AGGTTTATTT 4020

GTAACTTCTG GTGGAGCAAT TCATGATGCA GGTGCTTTGG GTGCATTAAT AGGATACGCA 4080

ATTATCGGAA TAATGGTATT TTTCTTAATG ACGTCACTTG GCGAAATGGC TACGTATTTG 4140

CCAGTATCAG GTTCATTTAG TACATATGCT ACAAGATTTG TTGATCCATC TTTAGGGTTT 4200 

GCGCTTGGTT GGAACTATTG GTTTAACTGG GTAGTGACTG TAGCAGCAGA TATTACGATT 4260 

GCAGCACAAG TCATTCAATA TTGGACACCA TTGCAAGGCA TACCCGCTTG GGCATGGAGT 4320 


GCGTTGTTGT TAGTTATAAT TTTTAGTCTG AATTCGTTAT CAGTTCGCGT CTATGGTGAA 4380 

AGTQAATACT GGTTGGCATT GATAAAAGTG GTTACAGTTA TTGTTTTCAT TGCAATTGGT 4440 

TTATTAACGA TTGTCGGAAT CATGGGTGGT CATGTTGTAG GATTCGAAAT ATTTAATAAA 4500 

GGTGAAGGTC CAATTCTTGG TGGCAACTTA GGAGGAAGTT TGTTATCAAT TCTAGGTGTA 4560

TTCTTAATCG CTGGTTTCTC ATTCCAAGGT ACTGAGTTAA TTGGTATTAC GGCTGGTGAA 4620 

TCAGAAAATC CTGAACGTGC TGTGCCGAAA GCAATTAAAC AAGTATTCTG GAGAATTTTA 4680

TTATTTTACA TTTTAGCCAT TTTTGTTATC GGTATGTTAA TTCCTTATGA TAGTAGTGCA 4740 

TTAATGGGGG GTAGTGATAA TGTAGCAACG TCTCCATTCA CATTAGTGTT TAAAAATGCT 4800

GGATTTGCGT TTGCAGCATC ATTTATGAAT GCAGTCATTT TAACGTCTGT GTTA 4854

(2) INFORMATION FOR SEQ ID NO: 107: 
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(A) LENGTH: 2488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

ATCAAAAATT GATTGTTTTC nATTTTTTGT TTCAGCGCGG GATCTTTTAC GTCTTTTGTG 60

AAAACGaTTT TATTATTAAC TACTTTTACT GGATAACTTT TGTATGTCGA GTCAGTAGCA 120

TTTTTTCTAT CGTTTGTAGT TGTGTCATAT TCACCAgTTA TTTTATGTGT GTTCTTATCT 180 

ACCTTTAACA ACATACGGTC TTCTTTTAAA AGCTCATCTG ATCCAACAAC TGAATAAGAG 240 

GATTCTATAT ACCATGTGTC TTGATCATTA TTTTCATAAT GGGGATTATC GTGACCATCA 300 

ATTTCATAAA GCGTTTCTAA GTTTTTAATA GGATACGTAC TTAGTACTTT TTTAAGACCA 360 

TCTTTCAAAT GAATTTGTTC CCACTTCATT GCCAAAAACA TATCGCCACT GACTACAATT 420 

GAAATAATAA TAATTGCTGC TAAGTTTAAC CAGAAAATTT TATGTGCTTT CATACATTCC 480 

CACCGTTTCT CAAAATACTT CATTAACACT ATAATAATAT ATTTTGAAAA ATATTTACAT 540 

CAGTATTAAA GTGAATATCA AATTTTAAAT TTATGAAAAT AATAGATATT TATAAAAAGC 600 

GGAAAAGAGA TACAATAAAA AACTGCATGA CGTTTGAGAC GTCACACAGT GTAACTAAAA 660 

ATTTAAAAAG TTGTTGCTAA TTTTTCAGCA TTATTAATAC TAGTTGCTTT AATTTCTTCA 720 

GTCTTATGAG GTTCAGCATT GTGTCCTTCA ATAATGATTG TTTCATATGA TGGCACACCT 780 

AAGAATGTCA TAATTGTTCT TAAATAACGG TCACCCATTT CAAAATCAGC AGCAGGTCCT 840

TCAGTATAAT ATCCACCACG TGATTGAATG TGTAATACTT TTTTGTCAGT TAGTAAACCT 900

TGTGGTCCTT CAGCAGAATA TTTAAAAGTT TTACCTGCAA TTGAAATAGC ATCAATATAT 960 

GCTTTAACTA CAGGTGGGAA AGAAAGGTTC CACATAGGCG TTACAAATAC ATATTTATCT 1020 

GCACTTAAAA ATTCTTCTAA AATGTCACTC AATCTTGAAA CTTTCATTTG TTCATCATCA 1080

GTTAACGTTT CGCCATTACT CATTTTTCCC CAACCAGTTA ATACATCTTT GTCAATAACT 1140 

GGAATATAAG TTTCArATAA ATCAATATGT TTCACTTCAT CATCAGGATG TTGTTGTTGA 1200 

TATGTTTCGA TAAATGCTTT ACCAGCCGCC ATAGAATTTG ATACCAGTTC ATTAAAAGGG 1260 

TGTGCTGTAA TATATAATAC TTTTGCCATT TGAAAATTCT CCTCTGkTTC TGTTATTTTC 1320 

TTAAGTATAA TTATTATACT CGATATAAAA TTTAATATCA ATCAAAATAT TCAAATTACC 1380

ATCATTTTCT TCATCTATAT nTGGCAGTAC TACTAAAGTA TGAGTGCATT TAATTATGAa 1440 

ATAGTTGATT TaGAATAtAT ACTTAATACC CAAAATATAT GAAGGATGGA TGCCACTATG 1500 
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ATTATTTATA TAGATGACAT TCAAAAATGG TTTAACCAAT ATACCGATAA ATTGACACAA 1620 

AATCATAAAG GACAAGGACA CTCAAAATGG GAAGACTTTT TTAGAGGGAG TCGGATTACT 1680 

GAGACTTTTG GTAAATATCA ACATTCACCA TTTGATGGTA AGCATTATGG CATTGATTTT 1740 

GCATTGCCAA AAGGTACACC AATTAAAGCG CCGACGAATG GTAAAGTAAC ACGTATCTTT 1800 

AATAATGAAT TGGGCGGCAA GGTATTACAG ATTGCCGAAG ACAATGGAGA ATATCACCAG 1860 

TGGTATCTAC ACTTAGACAA ATATAATGTC AAAGTAGGTG ATCGAGTCAA AGCAGGTGAT 1920 

ATTATTGCAT ATTCAGGCAA TACAGGTATA CAAACGACAG GCGCACATTT ACATTTTCAA 1980 

AGAATGAAGG GTGGCGTAGG TAATGCATAT GCAGAAGATC CAAAACCGTT TATCGATCAG 2040

TTACCTGATG GGGAACGTAG CCTATATGAT TTGTAGTTAT AGAAGGGTGC CCGCAGTCTA 2100 

AAAAATTAAG CAATCATTGT GTGAGTATGA TACTTACATA ATGGTTGCTT TTTTCAATGA 2160
AAATCGTAAT GCTAAGTCAT ACTTGTTTGA TTTAGATATT ACTTAAAATG TAAGACAAGG 2220

TTGTTAGCAT TGGCAGTGAA ATATCGCACA TAAAAAACAT TATTGTCACA CTAGAAAATA 2280 

GTTGTGCACT ATATCAATTT TCTGTATAAA AGTTTAATTC TGACAGTAAT GTAAACGTTT 2340

ACAATTTATG ATTGACATTA ATAATGACTG AATATATGAT TTATGTAAGT ATTTGTGCAA 2400 

CGTTTTCACA AAGTGTATTG CACaAyCAAA CTGtAAACaA aGTATGGGGg GCCATAACAT 2460 

GGCAGAACTA AGTTAGAGCn TATTAAAA 2488 
(2) INFORMATION FOR SEQ ID NO: 108: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4093 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
TTTTCTTTAT TTCAAmCTGT ATATTaATGA TGTCACTTCA TTTGATACGA TTCTTGATAA 60 
CCTATTCAAA ATTCCGCCAA ATAACATAAA TATTATATAA ATGCCGATAC TTTTAATCAT 120 


TTTCTACTTT TTCTTCGATA CGGAAACTTG TTTTCGAATT GAACACTTCA CCAGCTTTTA 180 
AAATTGACGG TGCTTTTTCA CCATATAAAT TAATATCATT TGGTAAAAAT TGTGTTTCTA 240

ATGTAAAGCC AGAATGTGGT TTATAAATAT TAAATGGACT ATCCCACTCA TCAGGCTGGT 300
TAAAAGTAAA GAACACAACA TGAGGCATAT CTGTATCGAC CTCTAACATA AATTCATGAT 360 
TTTCAACATA CATTTTATGT TCACCAACTG TAAATGGGTG ATCGAGACCA CCAAAACGTG 420 
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TATCTTCAAA CACTTCATGT AAATCTAGAA TATCACCTGT AACAATATTT CGCTCATCTA 540

ATACATACAT ATCTAATTGA TTACTTGAAA TGCGATGATT ATCAACGACA TTATTATCTC 600

GATTCAAATT GAAGTACACA TGATTCGTAG GACTAAACAA TGTGTCTTCT GATGCAACTG 660

CTTCGTATTC AATCGACCAT TGGTGATCCG CATCATAAAT ATGTGTAATC GTCACATCGA 720

TATCACCCGG GAAATGATCA TCAGCTGATT TCAACACCGT CTTAAATATA ACTTTAATTT 780

GAGCAATTTC ATTTCTAATT TCATAATCAA ATAACTTATT GTCCAAACCA TGACATCCAC 840




CATGTAAATG 


ATGTTCACCG 


*F^P^^^P^P^T^Hf^^^^P 


CTAACTGATA 


TTCTTTACCT 


TTCAACTTAA 


900 


15 


ATTTAGCATT 


ATCAATTCTA 




TTC CTATAGA 


AGCACCAAAT 


TTAAAAGGAT 


960 




TACTATGATa 


AAATTCATCC 


G C 1 1 CZAACAA 


CATTTCCAAG 


AACAATATTA 


TTATCATGAT 


1020 




ATTTCCAAGA 




CjCTCCATAAT 


TCGTAAAAAT 


AATTTTAGTT 


TCATCATTAT 


1080 


20 


PAATTTTGAT 

^p^v^ x x x x W% x 


TAAATCTAPA 
x fvvi x t x nwi 




viGTGCTCAAC 


TTCAACTATC 


ATTTTTACTT 


1140 




CTCCPJTTCTA 

\v X V* V* V* X X w X ^1 


AC PAPAAGTG 


1 X LAAbt I V- J. 


(j (JTGGGTAG C 


AACATTACTA 


AAACACCTAC 


1200 


25 


AATACAAATG 


ATTGPAPPGA 




TTTATCTGG C 


ATTTGTTTAT 


CTACGACCAT 


1260 


CGCAAAAATC 


AAAPJTPJVTGA 


1 VoM 1 Ann 1 At 




GCTGCATATA 


CTCTTCCGAA 


1320 




TGATGGAAAT 


GATTGAAATG 


TPGPAATGAP 

X CVJWW X VJAV 


ntmiillAHW 


ATGAGTATCG 


CACCGCCTAT 


1380 


30 


TAGCCCAACA 


AGTGAAGACT 


GTPCTTPPPT 


AAGPPAPAGP 


CAAATCAGGT ATCCCCCACC 


1440 




TATTTCACAT AAGCCAGCTA ATATAAATAT AAAAATCGGA TATAACATGA AATCACTCCA 1500
TCACACATTT GCTATCAATA ATCTATCGGC TACATATCAT TTGTTTACAT TTCTTCTTAC 1560
TTCACATTCC CATTTTAAAA AGTTCGTTTT CACATTCATA TTGTACACTT TTTTAGACAT 1620
TATTCTATAG CTAAATATAA AAAAATAAGA GTAACACGCT TTCATCATCA TTTTATATGA 1680
TAAATGTGTG TCACTCTCAT CAATTTTATT TTTTAAATAC ACGTTTCATT GAATTAAATA 1740
AGCCACGTTC AAATGTAAGT ACTGAATCTT TATATGTTTT AATTGCAATC CATATCAAGA 1800
CAGCTACCAT TACAATTGAG ATTAAAGAAC TTAAGATGAC CTCATATATT TGAAGCCCTG 1860
AAGTTTGAGC GCGTACAACT AATTGAAATG GCGCTAAAAA CGGAATATAA CTTGTGATTA 1920
AAGCAAGTTG TCCATCAGGA TTATTTATCG TGAATATCGC GATATAAAAT GCAATCATAC 1980
CAAGTAATGT CAGTGGCATC AAAGATTGAT TTAAATCTTC TATTCTAGAT GTTAATGATC 2040
CGAGGATGGC TGCAAGTAAT ACATACGCCG TAATTCCAAC AATACTACTT ATAATTCCGA 2100
CAATAATAAT TTGCCAAGAC AATTGATTCA TTTCCACGTT AAAACCTTGT AGCAAGTCTT 2160
TTAAGTCAAA GGCAAAAATG CATATAACTG CCATCAATAC AATTAAAATA ATCTGAGTCA 2220
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TAATAATCAT TTCAATGACA CGCGATGTTT 
ATGCATAATT TAAAACAATG AAGAACATTA 
TGAAAATCTT TTGTCCTTCT GATACTTTAT
CAACTTTACT TTGTGCTTGT AATTTTTGTA
CTACCATATT TGTTTGAATA GCTGTAAGCA
TTACTCGCTT CTCACTAATG ATTGTCCCTT
TATAAGCTTT ATCAAGTTTA TGTTTTTTTA
TAGTAAACTT AGCATCACTA TGAAATGTAT
GTTCATTCGG TGCTGCTACA CCAATTTTAT
TATCAATGTT AGATAGGCCA ATCATTAAGG
ATTTAGCTTT AATTTTTTTG ATATATGTCA
TTGCCACCAA CCTTCTCAAT GAATATATCT
TTAACATAAC CTTGATGTGC CACAACTTGA
ATCGTCAACT GAAGACCTTG CTTCATGTTT
AAATCTGGTA GTGTTGTTTC TGATTCAATG
ACATGATTGA TATCACCAGA AACAACAAGT
CATAATTCTT CAACATGCTC CATACGGTGA
TTTAAGTCTT TAACTGCTTc TTTTAATAAC
GGCTCATCTA ATATTAGTAA TTCTGGTTTA
TGTTGATTCC CTTTTGATAG ACTATCAATT
CGCTCAAGCC AATACGATAT TTGCTGTTGT
GCCAAATATT TCAATTCTTC TTCAACTGTC
AAATAACCAA TACGATTGTA CATTGTTTTA
CCTTCAGTTG GTTCACTTAA GCCTAAAATC
TTTCTTCCTA GAAAACCTAA CATTTTACCT
GCCGTCATCT TGCCAAAACG TTTCGTAACA
CTAAAAAnAT ATGTATTTAT CTTAATATAA

TAAAATGAAT TTATTTTTAA AATTTCTGAA 
ATGTTAAGTA TCATTAGCAC TAGATATGTT 
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TCTCACTAGC AATTTCCATA GCTATTTGAG 2340 

GAAAGATAAT GCCATmaGcT AAAGCATAGT 2400 

CGACTTCATC ATTAGAAATC ACCTTATTAT 2460 

AGTCTTCTTT GTTGATATTT AATTCCCCGG 2520 

GTGCTTGTAC TTTTTGTGAA TCTTCATGAC 2580 

GTAACGTGCG ATTTTGATTC ACCTTGATAA 2640 

CTTCTTTTTC AGCATCTTCT ATAGAAACTT 2700 

TCGCCTGTTG CTTGAAAACC TTATAGATTT 2760 

CTGGACCATC ATCAAACATG TTAATAATCT 2820 

CAGCAATAAT AATCATAAAA ATTACAAATG 2880 

AAGTAAATGT CGCCCAAAAC TTATGCATCC 2940 

TGTAATGATG GTTCTACAAC TTGGAATCGT 3000 

TAAATATCTT TGGCTACGTC TTCATTCTCA 3060 

TCACTATGAA TGATGCCTCT AATGTTTGTT 3120 

ACAACTTTCT TGTTACCATT AGATGCACGT 3180 

TGACCTTTAT CTAAAATACA AACATCATCA 3240 

GAACTATAAA CGATTGTACT GCCCCAATCA 3300 

TCAACATTAA CTGGGTCTAG ACCACTGAAA 3360 

TGTAACATAC TTGCTAACAG CTGAATTTTT 3420 

CGTTTTTTGC GGTTTTCAGT AATATCAAAA 3480 

ATTTCTGTTT TTGACATTCC CTTTAAAGTT 3540

AATTTCCCAT GTAAACCGCG TTCTTCCGGT 3600 

TCTAGTTTTT TACCGTTATA CGTrrTGTGT 3660 

ATACGAAATG TCGTTGTTTT ACmTGCACCA 3720 

GATTCTAACT TTAATGAAAT ATCATTTACT 3780 

TGTTCAATTA CAAGTCCCAT ACTTTGCCTC 3840

CATTTCCATT CTCTATAAAT GCAATATTTT 3900

ATTGAAAAAT TTAAATAGTG CCATTTTTGC 3960

TTTTCCATGC CTTTATTGCC TTATTTGTAA 4020
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CTTnCCGGTG TTT 4093 
(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17846 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

TGCCAAACTA CCTTTTGACA GTCGTTGCTG TACTTCAGGA TGATCAATCA CATATnTTAC 60

TTTATCAAAT AGGGCATCTT CATCATTTTT AGTAATTAAA TAACCATTGA AATCTGAAGT 120 

AATCAGTTCG TTAGGTCCAT ATTTAATATC ATAACTAATA ACTGGAACAC CATGTGCTAA 180 

AGATTCAAGT AGCGCTAAAG AGAAACCTTC CATGTTACTT GTTATTAAAC TCAAATAGGC 240 

ATCGCTATAT TCTTGGTCTA GATTGCTTAA AAAGCCGCGT AAGTAAACAT GATTTTCCAA 300 

TCCATATTTT TGTATCAATT CATTTAATTT TTTACTTTCA GAaCCAAAAC CATACATATG 360 

AaGCTCTATT TTTGGGACAT ACGATACTAA GCGTTTAATT AATTCAATTT GTTGATGTAA 420 

TTGTTTTTCA GGTGAATAAC GAGCAACGGA AATTAATTTA ACACTGCGCT GATCTAATGT 480 

TTGGACTGGT GTATCAATTG TTTCACTATA GCCGACAGGA ATATTAACAA CTGGAATAGT 540

ATGGTTAATA CGTTTTTCAA CATCTAATTT TTGCTGCTCA GTAGAAACGA TAATTGCACG 600 

ATATCGAGAT AAATTTTCAA ACATCGCTTT ATATACATTT TTAAATGGCG ATGAATCTAA 660 

TGCATCAATA TTTTTAATGT GTGTACTGTG AAGCACAGCT ACTACTGGGA TTGACTCAGG 720

CGTTAAGTTG AAAATAGGTG CTGTGTACAC ATTACGATCA CTGAAAAATA AATCCCCATG 780 

TTGATATAGT TGTTTAATGA AAAATGCGCC TAATTCCGTT TCATTATTAA AGAAATATTG 840 

TTTGTTAGCA TAGTAAACAA TAATTTTTTG TACTTCTGGT TTGCCATCCT TGTAAGAAAA 900 

ATACTTTTCT AATTTTGTGT CACCTTCTGG ATTATAGAAA AATTCACATA ATGTTTGTTG 960 

TTTATCAACA AGAATCCTAC TACAACTTAA AAAGCCACGC ACATCATAAA AATCACGTTT 1020 

TACTTtTCGT CTTTGACTAT CAAAATGATT TACATAATCT AATATACGAT ATTTAGGATC 1080 

TTGAAAATGG GCATACATTA AGAAACGCTC TTGATCATAT ATTCTAAAGT CATGACTATT 1140 

TTCAACATGT TTTAAAGTAT AATGACATTC ATCAGTCCAA TACGACAACC AGTCAAATGG 1200

TTCATTGCGT TCTAAATATG TTGCTTCTTG GAAGAAATCA TACATATTAA TATAGTCAGA 1260 

ACTAGTAATA TAATTTTGGG CATTTCTATA TAAATATCTA TTCCATGACA GAAATACACA 1320 
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CCCAGTTAAA TTAACACCTA AACTATTACC TACAAAATAA TTCATTTACA ACACCACTTA 1440
TATCTATTTT TTATAATTAT ATCACATAAT ATTTAATTAC TTCTTTTAAC TGGAAGATGT 1500
GTTTATTTAT AAAACAACAA ATTTTGATAT TTATAATGAT AGTAGTTATT CAATCAcTAC 1560
GACCcAATAT ATCATkGTAG AGCTTAGGAT ATTGATTTAT GACTCAGGCA CATCAAATGa 1620
GAgGATTTAT AAArGAGATA TACAACTCTA GAAGGTATAA TAAAAACGCG CAACTAATGT 1680


TACGCGTTTR 


AATTAfiTVAT 
nMl Innl 




TTGCGATACT TTAATTTAGC GAAAgcATCA 


1740 


TGTTGATGGA 

X vj X X un X UUn 






TCGATATCGA 


AACCGTCTAA 


CCAATCAAAT 


1800 


x i_nm. x £v\\j x 


rT^ClCClfl/** ART 
I* ^ v UVJ^V-AA 1 


1 AAACGAA I L 


AAGTCTTCGA 


CAAAACGTGG 


ATTTTGATAT 


1860 


g c 7l ccir w rt Hf rr' 


1 ^-HL-HL-o 1 1 J. 




CGTTTTAAAA 


TAGGGTATAG 


AATTGAACTT 


1920 






TAAAATTTTA 


TTTTTATAGT 


CATCAACTAT 


GTCTTGATCT 


1980 


1 1 A 1 


A I\j 11 11 AAL 


AGTGACAACA 


CCACGTTGGT 


TGTGCGCTGA 


ATACTCACTT 


2040 


/IX X 1^4 1 X lv 


AACAAGGGCA 


TAGCGTTGTG 


ACAGTTGCTT 


CAATAGTAAG 


TTCTTTACGT 


2100 


VJX>U1V_X 1 X>\x 


CACCGTCAAT 


TGCTAATCCA 


TAAGTGACAT 


CGGCATTACC 


AACTGCTTTA 


2160 


nl/iii 1 x \j IsjVj 


TTGGACTATA 


GCGATCAAAG 


AACCATTT C C 


CAGAAACATC 


AACGCCTGCC 


2220 


GPATTTTYVrT 


TCATATTCGT 


TTGTAAAGTG 


CGTAACACCT 


GATAAAGTGT 


ATTAAATTCA 


2280 


AGTTCAATAC CATTATCATA GTGCTTTTCA ACACTTTCGA TTATACGGCT CATATTAATA 2340


CCTTTTTCGT 


CTTTTGTTAA 


ACTTGTTGAA 


AAACTAAATG 


IvjCCAGCTGT 


TTGATACTGG 


2400 


TCAACAAGTA CAGGGTACAC TAAGTTTTTA ATACCAACTT CTTCTATTTC AAATAAAAAA 2460
TCTTTATGTG TACTTTGTAA ATCTGTCATT TCGTTCTTAG TAGTAGGTTT CGTGCCTTCA 2520
ATAGGATCTA CGGAACCAAA GTGTTTCCAA CGACCTTCTC GTGTCGATAA ATCAAATTCA 2580
GTCATTTTTT TCCTCCGTTA AGATTTAAAG TGATATGTCC AATATGGTTC GACTGTTAAA 2640
AAGCTGTGTT GTTTACCATC GATTTCAGGA CTTGCTAATT GTTTTAAAAA TGGACCTGTT 2700
TGAGAAGCAT GTGCTTCAAA TGCCTTAATT TTAAGTTCTT TAAAATCTGT AATATCATTT 2760
TGAATATCAG GTTCTCCAAG AGCTTCGGTT GCATCATTAC TGAACGCAAC TAAAGTTAAA 2820
CGAGGGCGTT CTTCTTTAGG CATGCGTTCA ACCGTTCGAA TTACAGCGTC TGCTGTTGCT 2880
TCGTGATCAG GATGTACTGC ATATCCAGGA TAAAATGAAA TAATCAATGA TGGATTTGTA 2940
TCATCGATTA AAGATTTAAT CATACCATCT ATATGTTCAT AGGGTTCAAA TTCGACAGTT 3000
TTGTCACGTA AACCCATTTT TCTTAAATCA GTAATACCGA TAACTTTACA AGCTTCTTCT 3060
AGTTCACGCT CACGAATACT TGGTAATGAT TCGCGTGTTG CAAATGGGGG ATTACCTAAA 3120
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TAATTTGCTA ATGTGCCTGC AGATGAGAAG 

AATACATGTC TTTCGTCAGT CATGTTGATG 

AATTTGAAGT GCTGCAGCGA GTTGACCTTC
ATGCTCATTG ACCTCAAAAT GCGTTAGACC
TTTAAGACCA ATGCGATAAG GTTCTTTATT
TATTTGTATG TTTCTTAAAA AAGTACCAGC
ATAGGCCCCA TTTGTCGTTT CAACATGCAG
TAAATCTATA ACTTCTTGTT CTTTAATTGG
TGTGTTTATC TTTCTATTTT ACTAAAAACT
TTTATAAATT AATTTTCATG AAGGGTAATT
TTTTTTACTT TTAAAAATCA AAAATTTGTT
GATGCTATAT TAATGGTGTA TGAATGAATT
GAGGCATGTA AACAATGAAA GTATTAAACT
TTGCATGTGA GTTATATAAA GAGATGGCAT
CTGGTGGTAC AATGACAGAT TTGTATGAAC
TAAACGTAGA CAATGTATCC ACGTTTAATT
ATCCGCAAAG TTATCACTAT TATATGGATG
ATAGAAAGAA CATTCATATT CCAAATGGAG
AAATATAATG ACGTTTTAGA ACAACAAGGT
GAAAATGGTC ATATTGGATT TAATGAACCT
GTTGATTTGA CTGAAaGTAC TATTAAGGCT
GTTCCAAAGC AAGCCATTTC GATGGGACTT
TTACTCGCAT TTGGTGAAAA GAAACGTGCT
TCTGTTGATG TTCCAGCCAC ATTACTTCAC
GACGAAGCTT GCCCGAAAAA TGTTGCGAAA
TGTTTAATTA AGAAATGCCT CGGGAAAGGT
ATGATTTTTA GTGGAATTAC AATTAGCAAT
GTTAGCAAAT AAAGTAAAAG ATTATGTAGA
CAACGaAGGT TTACCAGCAG TTAAACATAT



GTTTCATCAT CAGGATGTGG AAATATTACT 3240
CCTCCTCTAT AAATTAAATG GTCGCTCACT 3300
GTAATTAAAA CCTGCAATTA AAAATTCATC 3360
TTGTACATAA ACCCAACCAC CATTTGATAG 3420
ACCACCTTTT AGTTGTGCAT GCGTATATGT 3480
ATTAAAAACA CGTTGATCGA AATGGTTCGC 3540
ATACACAGGT TTATGTTCAA AAGAAGCAAG 3600
TTCCAACACG TTCACTCCTT ACACTATCAA 3660
ATTCGATAAT TGTATACGAT TGCTCAATTA 3720
ACTCAGGATT ACGTAATCAT ACAGCATTAG 3780
GGAATTTGAA AAGTGTTAAA CATTAAAAAT 3840
CATAAGTTTT TAAAATGTAT TAAATTTGTG 3900
TAGGATCGAA AAAACAAGCA TCATTCTATG 3960
TTAATCAGCA CTGTAAACTA GGTTTAGCAA 4020
AACTTGTTAA GTTGTTAAAT AAAAATCAGT 4080
TAGACGAATA TGTAGGTTTA ACCGCATCAC 4140
ACATGCTTTT CAAACAATAT CCTTATTTTA 4200
ATGCCGATGA TATGAATGCG GAAGCGTgCA 4260
CAACGTGATA TTCAAATTTT AGGTATTGGT 4320
GGTACGCCGT TTGATAGCGT TACTCATATC 4380
AATAGTCGAT ATTTTAAAAA CGAaGATGAT 4440
GCTAATATTC TTCAAGCCAA ACGTATCATT 4500
GCTATTACAC ATTTATTAAA TCAGGAAATT 4560
AAACACCCGA ATGTTGAGAT ATATTTAGAC 4620
ATTCATGTCG ATGAAATGGA TTGATTGCAA 4680
TCCAATAGAA AGATAAAAAG CATTGGAAGG 4740
TGATTTATTA AACAAAGAAG ACGCGGCTGA 4800
TATCGTAGAA ATCGGTACGC CAATCATTTA 4860
GGCAGACAAC ATTAGTAATG TAAAAGTATT 4920
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CGCGGATGTA ATTACAATAC TAGGTGTTGC AGAAGATGCA TCAATTAAAG CAGCTATTGA 5040
AGAAGCTCAT AAAAATAATA AACAATTACT AGTTGATATG ATTGCTGTTC AAGATTTAGA 5100
AAAACGTGCA AAAGAACTAG ATGAAATGGG TGCTGATTAT ATTGCAGTAC ACACTGGTTA 5160
TGATTTACAA GCAGAAGGGC AATCACCATT AGAAAGTTTA AGAACCGTTA AATCTGTTAT 5220
TAAAAATTCT AAAGTTGCAG TAGCAGGTGG AATTAAACCA GATACAATTA AAGATATTGT 5280
CGCTGAAAGT CCTGATCTTG TTATTGTTGG TGGCGGAATC GCAAATGCAG ATGATCCAGT 5340
AGAAGCTGCG AAACAATGTC GCGCTGCAAT CGAAGGTAAG TAATATGGCT AAATTTAGTG 5400
ACTATCAATT AATTCTAGAT GAATTAAAGA TGACTTTGTC ACATGTTGAA GCGGATGAGT 5460
TTTCAACTTT TGCATCCAAA ATACTACATG CTGAACATAT ATTTGTAGCT GGCAAAGGAC 5520
GTTCAGGATT CGTGGCGAAT AGTTTTGCAA TGCGCTTAAA TCAGCTCGGC AAACAGGCAC 5580
ATGTTGTTGG AGAATCAACG ACACCTGCGA TTAAGTCGAA TGATGTATTT GTAATTATCT 5640
CTGGTTCAGG TTCCACGGAA CATTTAAGAT TATTAGCAGA CAAAGCAAAA TCAGTAGGTG 5700
CTGACATCGT ATTAATTACT ACAAATAAAG ATTCTGCAAT AGGCAATCTA GCTGGGACGA 5760
ACATCGTTTT GCCTGCAGGT ACAAAATATG ATGAACAAGG CTCGGCACAA CCATTAGGAA 5820
GTTTGTTTGA ACAAGCATCT CAATTATTTT TAGATAGTGT TGTAATGGGA TTGATGACTG 5880
AAATGAATGT TACGGAACAA ACGATGCAAC AAAATCATGC TAATTTAGAA TAAAATAAAG 5940
ATAGTCGATA ATATGATGCC TAGGCAGAAA TATTATCGAT TATTTTTTTA TTTAAATAAT 6000
AAATTATAGT ATAATATCAA TAATAAACGA ATAGGGGTGT TAATATTGAA GTTTGACAAT 6060
TATATTTTTG ATTTTGATGG TACGTTGGCA GACACGAAAA AATGTGGTGA AGTAGCAACA 6120
CAAAGTGCAT TTAAAGCATG TGGCTTAACG GAACCATCAT CTAAAGAAAT AACGCATTAT 6180
ATGG6AATAC CTATTGAAGA ATCATTTTTA AAATTAGCAG ACCGACCATT AGATGAAGCA 6240
GCATTAGCAA AGTTAATCGA TACATTTAGA CATACATATC AATCTATTGA AAAGGACTAT 6300
ATTTATGAAT TTGCGGGTAT AACTGAAGCC ATTACAAGTT TGTATAACCA AGGGAAAAAA 6360
CTTTTCGTGG TGTCTAGTAA GAAGAGTGAT GTATTAGAAA GAAATTTATC GGCTATTGGA 6420
TTAAATCACT TGATTACCGA AGCTGTTGGA TCCGATCAAG TAAGTGCATA TAAACCAAAT 6480
CCTGAAGGCA TACACACAAT TGTGCAACGC TACAATTTAA ATAGCCAACA AACGGTGTAT 6540
ATTGGTGATT CAACGTTTGA TGTTGAGATG GCACAACGTG CTGGTATGCA ATCTGCAGCT 6600
GTCACTTGGG GTGCACATGA TGCAAGGTCA TTACTTCATT CAAATCCGGA TTTTATTATT 6660
AATGATCCAT CAGAAATTAA TACCGTATTA TAAAACTTGT TAAAACAGAG AATACCATGG 6720
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7A TTT 7A R * IVT*J\ 
Ai 1 


AATATTTATT 


AAACATTATG AATTTTTAAA 


GAGTAATGTC 


TGACTCGTTG 


6840 


ft ^Pft ft ^P^P*T*A r ^^n 
AiAAi 1 1 ATT 


TTTGTAAAAA 


TAAATTAAAG TAATGACAAA 


GTTATTGAAG 


TAAATTGAGT 


6900 


MiAAALAl TT 


AAATACGATG 


TCGAAAATGG CGATAGCATA 


TCACTTACAT 


GAAGTTGTGT 


6960 


w C L A I eta L TA 


TTTTTAGTTA 


TAATTCCAAA AAGTTAATCG 


TTCGATGATT 


TAAGAATTAT 


7020 


1A1 HjI 1 lAA 


TTCAAATGTA 


TGAGGGTATA AAATCATTGA 


ATTTAATTCG 


ATAAAGCGAA 


7080 


ATTTTTGAAC AAACATACTT TTGTATTTAT ATAAAAGTTT AAATTCTTAT AAATTTGACA 7140
AAACTAATTA ACTCCGTATA ATTATGAAAC ATACAAGAGG GAGTGTATGA ATTCATGGAT 7200


TTTAATAAAG 


ASaAAlAl 1AA 


CATGGTGGAT GCAAAGAAAG 


CTAAAAAAAC 


CGTTGTTGCA 


7260 


ACCGGTATCG 


vjIAAHjCAAT 


GGAATGGTTC GATTTTGGTG 


TCTATGCATA 


TAcAACTGCG 


7320 


TACATTGGAG 


IV TV I ■ 1 nil 

LvjAAL 1 1 L. 1 J. 


CTCTCCAGTA GAGAATGCAG 


ACATTCGACA 


AATGTTGACT 


7380 


TTCGCAGCAT 


lAVjUUAi ItjC 


GTTTTTATTA AGACCAATTG 


GTGGTGTCGT 


ATTTGGTATT 


7440 


ATTGGTGACA 


A AT a rr 


**p TV TV TV P TTP'P TV 4 f Mt i « h><iiwi« 

IAAAvjTTGTA TTAACATCTA 


CAATTATTTT 


AATGGCATTT 


7500 


TCAACATTAA 




Al IbLLAAbL 1 ATGATCAAA 


TTGGACTTTG 


GGCACCAATA 


7560 


CTATTATTGC 


TTGCAAGAGT 


-MV- 1 Av^AAVjtjij 1 1 1 i LftALAO 


GTGGAGAGTA 


TGCGGGGGCA 


7620 


ATGACATATG 


TTGCCGAATC 


ATPTCT'AG AT & Anr77T/'VP A 


AL I CATTAGG 


TAGTGGACTA 


7680 


GAAATTGGGA 


CATTATCAGG 


TTAPATAGrT GrTTPAATTA 


J. LjA I TGCTGT 


ATTAACATTC 


7740 


TTTTTAACAG 


ATGAACAAAT 


GGfATfATTT GGTTGflArzAA 


TCCCATTCTT 


ACTCGGTTTA 


7800 


TTCCTAGGAT TATTCGGCTT ATATTTACGT CGTAAGCTGG AAGAATCACC AGTTTTCGAA 7860
AATGATGTTG CAACACAACC AGAAAGAGAT AACATTAACT TTTTACAAAT CATCAGATTT 7920


1 A 1 1 ACAAAG 


ATATATTTGT 


ATGTTTTGTA GCTGTTGTAT 


TCTTCaATGT 


TACAAACTAT 


7980 


& TWf 7A TV OTP 


CATATTTACC 


AACCTATTTA GAACAAGTTA 


TTAAATTAGA 


TGCAACGACA 


8040 


A\_AAVj 1 vj 1 A I 


TAATTACTTG 


TGTCATGGCA ATAATGATTC 


CATTAGCATT 


AATGTTTGGT 


8100 


AAvji iAvjLtiO 


ATAAAATAGG 


TGAAAAGAAA GTATTTCTAA 


TTGGTACTGG 


TGGGCTAACA 


8160 


1 iAi I CAGTA 


TCATCGCATT 


TATGTTATTA CATTCACAAT 


CATTTGTTGT 


AATAGTAATC 


8220 


GGTATATTTA TATTAGGATT TTTCTTATCA ACTTACGAAG CGACAATGCC AGGGTCGTTA 8280
CCAACGATGT TTTACAGTCA TATAAGATAT CGAACTTTAT CAGTAACATT TAATATCTCT 8340
GTTTCGATAT TTGGTGGTaC GaCGCCATTA GTkGCAmCaT GGTTaGTTAC GAAAACTGGA 8400
GATCCATTAG CmCCTGCGTA TTATTTAACA GCAATCAGTG TTATTGGCTT TTTAGTTATT 8460
ACATTCTTAC ATTTAAGTAC AGCAGGAAAA TCTCTAAAAG GTTCGTATCC AAATGTAGAT 8520
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GAACGTAAGA ATTAGAGATT TTAATaAAAA GTATAAATCA ATCGTATATA AGCACTTTAA 8640

AGCTAGTAGG TTCTGCTAAC TTTAAAGTGC TTTTTAAATT GAGAACTGTA ATTAGCCGTA 8700 

ATAAAGTTTT TGTATATACA TAAACCCCCA CTGCAATGAT TATCGCAATG GGGGAAAGAG 8760 

GGGACTTAAA GCATATGTTT AGCTTTGAAT ACTTAAAATT CTCTTGCTAT TGAAATGTTA 8820 

GGATGTAAAT ATGTCTTAGA GTATTTTGTC CAACGCAATT AATATTGAGA CTCTAACCTT 8880 

CAATATTATT ATAGAGAACA CAAACTTAAA TAGATTGGGT GACTTATTTG TGTCAGTTAT 8940 

TGCGATTGCG ATAACTTCTT TTCTCTATAT ACATATAGTA ACGTCTTATC TAATAAAAAA 9000 

CATGGTACTA CAGTATCAAA TTTATCTAGG GCTTAAGTTT GATTTTTATA ATAGGCAGGT 9060 

TTACCTGATA AAAATACTTA TTCATTATAT AATGTTAACA ATATGTATTT TAAAGTTTAC 9120 

ATTGAGTGAG GGATATTGAT GAACGTAATT TTAGAACAGT TGAAAACACA TACTCAAAAT 9180 

AAACCTAATG ACATAGCATT ACATATCGAT GATGAAACAA TTACATATAG TCAACTAAAT 9240

GCCCGCATCA CTAGCGCAgT TGAATCTTTG CAGAAATATT CACTTAACCC TGTCGTTGCT 9300 

ATTAATATGA AATCACCGGT GCAAAGTATT ATTTGTTATT TAGCTTTGCA TCGTTTACAT 9360 

AAAGTGCCTA TGATGATGGA AGGTAAATGG CAAAGTACTA TACATCGTCA ATTGATTGAA 9420 

AAATATGGTA TTAAAGATGT AATTGGAGAT ACAGGTCTCA TGCAGAATAT AGACTCACCG 9480 

ATGTTTATTG ATTCAACGCA ATTACAGCAC TACCCCAATT TATTACATAT TGGTTTTACT 9540

TCAGGGACAA CTGGACTGCC AAAAGCATAT TATCGTGATG AAGATTCATG GTTGGCTTCT 9600 

TTTGAAGTTA ATGAAATGTT GATGTTAAAA AATGAAAATG CAATAGCAGC CCCTGGACCA 9660 

CTATCGCACT CGTTAACATT ATATGCGTTA TTGTTTGCTT TAAGTTCCGG TCGTACTTTT 9720 

ATAGGACAGA CCACTTTTCA TCCTGAAAAG TTACTTAATC AATGTCATAA AATATCATCA 9780 

TACAAAGTTG CTATGTTTCT TGTTCCAACG ATGATTAAAT CATTATTGTT AGTTTACAAC 9840

AATGAACATA CAATCCAATC ATTTTTTAGC AGTGGAGATA AGCTGCATTC TTCTATTTTT 9900 

AAAAAGATAA AAAATCAAGC AAATGACATA AATTTGATTG AATTTTTTGG TACATCGGAA 9960 

ACCAGTTTTA TCAGCTATAA CTTGAATCAG CAAGCACCAG TTGAATCAGT AGGTGTGCTA 10020 

TTTCCAAATG TGGAATTGAA AACAACGAAT CACGATCACA ATGGTATAGG AACTATTTGT 10080 

ATAAAAAGTA ATATGATGTT TAGTGGCTAT GTAAGTGAAC AATGTATAAA TAATGATGAA 10140

TGGTTTGTTA CTAATGATAA TGGCTATGTA AAAGAGCAGT ATTTATATTT AACGGGACGT 10200 

CAACAGGATA TGTTAATTAT TGGTGGTCAA AATATATATC CAGCACATGT TGAACGCCTT 10260 

TTAACGCAAT CTTCGAGCAT TGATGAAGCA ATTATCATCG GTATTCCAAA TGAGCGTTTT 10320 
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CAATTTTTAA AAAAGAAAGT GAAaCgnTaT GAAATTCCAT CGATGATTCA TCATGTAGAA 10440
AAGATGTATT ACACTGCAAG tGGTaAAATT GCTAGAGAAA AAATGATGTC GATGTATTTG 10500
AGAGGTGAAT TATAATATGA ATCAAGCAGT CATAGTTGCA GCTAAACGAA CTGCATTTGG 10560
GAAATATGGT GGCACTTTAA AACATTTAGA GCCaGAACAA TTGCTTAAAC CTTTATTCCA 10620
ACATTTTAAA GAGAAGTATC CAGAGGTAAT ATCTAAAATA GATGATGTAG TTTTAGGTAA 10680
TGTTGTTGGG AATGGTGGCA ATATTGCAAG AAAAGCATTG CTTGAAGCGG GGCTTAAAGA 10740
TTCAATACCT GGCGTCACAA TCGATCGGCA ATGTGGGTCT GGACTTGAAA GTGTTCAATA 10800
TGCATGTCGC ATGATCCAAG CCGGAGCTGG CAAGGTATAT ATTGCAGGTG GTGTTGAAAG 10860
TACAAGTCGA GCACCTTGGA AAATCAAACG ACCGCATTCT GTGTACGAAA CAGCATTACC 10920
TGAGTTTTAT GAGCGTGCAT CATTTGCACC TGAAATGAGC GACCCATCAA TGATTCAAGG 10980
TGCTGAAAAT GTGGCCAAGA TGTATGATGT TTCAAGAGAA TTACAAGATG AATTTGCTTA 11040
TCGAAGTCAT CAATTGACAG CGGAAAATGT AAAGAATGGA AATATTTCTC AGGAAATATT 11100
ACCTATAACC GTTAAAGGAG AAATATTCAA CACTGATGAA AGTCTAAAAT CACATATTCC 11160
GAAAGATAAC TTTGGCCGAT TTAAGCCCGT GATCAAAGGT GGGACCGTTA CCGCTGCGAA 11220
TAGTTGTATG AAAAATGATG GTGCAGTTTT ATTGCTTATT ATGGAAAAAG ATATGGCATA 11280
CGAATTAGGT TTCGAGCATG GTTTATTATT TAAAGATGGT GTTACGGTAG GTGTTGATTC 11340
TAATTTTCCT GGCATTGGTC CAGTACCAGC CATTTCCAAC TTACTAAAAA GAAATCAATT 11400
AACGATAGAA AATATTGAAG TCATTGAAAT TAACGAAGCG TTCAGTGCAC AGGTAGTTGC 11460
CTGCCAACAA GCTTTAAATA TTTCAAATAC GCAATTAAAT ATATGGGGTG GTGCATTAGC 11520
ATCAGGTCAT CCATACGGTG CAAGCGGTGC CCAATTAGTG ACTCGATTAT TTTATATGTT 11580
TGACAAAGAG ACTATGATTG CATCTATGGG GATAGGGGGA GGTCTAGGAA ATGCAGCATT 11640
ATTTACTCGA TTCTAACCAG CGATTAAATG TGTCATTTTC TAAGGATAGT GTGGCTGCAT 11700
ATTATCAGTG TTTTAACCAA CCTTATAGAA AAGAAGTACC ACCATTAATG TGTGCGTCAT 11760
TATGGCCAAA ATTTGATTTA TTTAAAAAAT ATGCAAATAG CGAACTGATT TTAACAAAAT 11820
CAGCAATTAA TCAAACTCAA AAGATAGAAG TAGACACAAT ATATGTAGGG CATTTAGAAG 11880
ATATTGAATG CCGACAGACT CGCAATATCA CACGTTATAC AATGGCTTTA ACATTAACTA 11940
AAAATGATCA ACATGTCATA ACGGTtACAC AAACTTTTAT TAAGGCGATG AAGTAGAGAT 12000
GGAGTTTAAT GAGATATGGA TAAATGAATA TTTGGCGCTC GTAAATGATG ATAATCCAAT 12060
ACATAATGAG ATTGTGCCAG GACAATTAGT GAGTCAAATG ATGCTGATGG CTATGTCATT 12120
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ATTCATTGAA CAACACGAAC ACGAAATTAT AGCAATTAAT GACGATGGAG AGATTAAAAT 12240 

AAAAATTTCT TTGAGCACAA AAAAATAACC GATATTAGCT GCATGAACGC ATATTAATTA 12300 

GGAGATGAAA GGACAGCTAA TATCAGTTAT GTATTGTTAT TATTATTGGG AACAGAGATG 12360 

AATATAGGTT ACGTTTCTTT CTTTGCACGG GGATGCATTA ATCTAAAATA ATAATAACAA 12420 

CTATATCAAT GTTTAATAAA TTCTGGATTA TTGGAACGAT TAGTCAATTT AACTAACTTT 12480 

CATATGATCT ATATCGTCTT GTAATAAAGA GAGCAATTTG AATATTTCAG TATCACTAAA 12540 

TGAATCGTCA CATTTAATTG AAACATGCTG AAACGTTTTG GTTATAATTT CATAAACTGG 12600 

TGCGCCTTCA TGGTGATACT GTCGATAAAT AATCATAACC TATATTACCT CCTTTGCTAC 12660 

TCTATGGTTA TATTATAAAT AACATTTTTA TGTGTGACAT CAACCTTAAG TATCAACTTT 12720 

TTATCAGACA TAGAACGTAT GATTTACTAA GACTATTTAT GTATAAAAGT TCTAAATAAA 12780 

TATATATTTA TAGAGTCGCC TGGCAGTCAT TTGGGaAATA TAACATATAT GATTAGAGAG 12840 

GCATCTATCG CAAAAGAATG ATAATGATAG AGGTATTGAG CATATAGATG AGTTTAAGTT 12900 

CATCTTGAAA ATAAAGGGTT ATTTAGTCAT AGATGTAGAT GTATAGGAAA TATTTGTATG 12960 

TATTGTTCGA TATGTATGAA ATTTTCAATA AAAGCTAATA ACGCTTATAT GTAACTTTCA 13020 

AATTTAAATT ATATACAGAG CATGATGATT ATAAAAAAAT AACCACATCA CATAAATTGA 13080

GTTCATACCC AATTTAAGTG GTGTGGCTAA TAATGTTGAT TTATAGATGA ACCGCCTAAT 13140 

CGTTAAACCT CTGTTACTTC AACATCGATA TGTTCAATAC GGTTGTATGC ACCGTGATCC 13200

ACAGGACCAA CAAAATCATT CATTTTCCAA CCGTTTTTAA TAGCAGAAGC GACGAAAGCT 13260 

TTCGCGCTAA TCACAGCTTC TTTCGGTGAC TTACCGTTAG CTAAATATGC AGTTGTTGCC 13320 

GCAGCAAATG TACAACCAGC ACCATGGTTA TAACTTTGTT GGAACATGTC TGTTGTTAGT 13380 

TGATAAAATG TTTGACCATC ATAGTATAAG TCATACGATT TATCTTGATC TAAAGCTTTG 13440 

CCACCTTTAA TGATGACATG CTGTGCGCCT TTATCAAAGA TAATTGTTGC AGCCTTTTTC 13500 

ATATCTTCAA TTGAATTTAA TTTACCTAAT CCTGATAATT GACCCGCTTC AAATAAGTTT 13560 

GGTGTCACTA CCGTTGCTTT AGGTAGTAAA TATTTAATCA TCGCCTCAGT ATTTCCAGGA 13620 

TTAAGCACTT CATCTTCGCC TTTACAAACC ATGACAGGAT CTACTACAAA ATATTGTGCA 13680 

TTAGATGCCT GATATACTTC TCCAGCACGT TTGATTATCT CCTCAGTACC TAACATACCT 13740

GTTTTAATAG CATCAGGTCC GATTGATAAA GCCGTTTCAA GTTGTTTTTC AAATACATCC 13800

ATTGGTAATG GTGTAACATC GTGTGACCAT GTATCTTTAT CCATAGTAAC GATGGCAGTT 13860 

AAAGCGACCA TGCCATACGT ATCTAATTCT TGGAACGTTT TCAAATCTGC TTGCATACcT 13920
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CACTCCTACA TAATAATATT GTATTCATCA 

AGCATTCAAT ATTTGATGAT TGTTGAAATG 

TGTCATTCAC TTTAGATAAG TGTGATATGT
AATGGTCGCA AATTTTTCAT GACATAACAA
TTTTAGAAAA AGAATATTCG ACTGCAATCG
CGTTTGATTT AACACCGTTT GAAAATATCA
ATGGTCCAAA CCAAGCACAT GGATTAGCAT
CATCTTTACG TAATATGTAT AAAGAATTAG
CGCATTTACA AGATTGGGCA AGAGAAGGCG
GACAGGGTGA AGCAAATTCT CATCGTGATA
TTAAAGCAGT GTCTGATTAT AAAGAACATG
AGCAAAAAAT AAAGCTTATC GATACATCTA
GTCCACTGTC TGCATATAGA GGATTCTTTG
ATTTAGAGTC AGTAGGAAAA TCACCAATTA
ATAGAGAAAC TTTAATAGCA CGAATTGAGC
ATGACCATGA CTTTGAAAAA CATATGTATG
CAACATCAAA TACACCACAT ATTGGTGAAC
ATCAAATGCC ACAATCACAA ATAACGCAGC
AAGCGATGGG TGGTAAAGTA AATACGCATT
AACCTTCAAA CCAACAACAA AGATTAGCGA
TATttGATTT TTAAAAAGCA ACAATGAAAC
GGTTAATAAT CAAGACGCAT ATACTTTTAT
ACTGAATTAT ATAAGGAGAG GTAGCAATGA
CGATGATGGC TGTCGGTACA GGTGCATTTG
ATCACTATTT ATCAGTATGG GAAAAAGCAA
TATTAATTAT AGGTGTAATT AGTGGTACAA
TAATATTTGC TGGTATTATT TTCTTTAGTG
TTAAAGTTTT AGGTGCGATT ACGCCAATTG
TGTTAATCAT TGCGACATTC AAATTTGCTG

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TATCATTTTT AACCTAATTG AAAAATATTA 14040

AATCATTCAT ACTATTGTAA CTTTTGAAAA 14100 

TAAAATATGT CCTGAGGTGA GATTGAATGG 14160 

CGAAACATGA CTTTAAAGCT ATGCATGATT 14220 

TATACCCTGA TAGGGAAAAT ATATATCAAG 14280 

AAGTTGTTAT ATTAGGACAA GACCCGTATC 14340 

TTTCAGTGCA ACCTAACGCA AAATTCCCTC 14400 

CAGATGATAT TGGATGCGTT AGACAAACAC 14460 

TCTTGTTATT GAATACAGTT TTAACCGTAA 14520 

TTGGTTGGGA AACATTTACT GATGAAATTA 14580 

TTGTCTTTAT TTTGTGGGGG AAACCTGCAC 14640 

AACATTGTAT TATAAAATCA GTGCATCCTA 14700 

GATCAAAACC GTATTCCAAA GCGAATGCCT 14760 

ATTGGTGTGA AAGTGAGGCG TAGATGTTGA 14820

AAGAATTAGT ACAAGCAGAG CAGGCACAGC 14880

CCATACATAT ATTAACATCT TTATATGCTT 14940

AACAAATGAA TCGTCGTATT GCTAACCATA 15000 

CAACTCATCA AGTGACAGTT GCTGAAATTG 15060 

CAGCACATCA TCATAATAAG TCATATTCAC 15120 

CAGATGATGA CATTGGCAAT GGTGAATCCA 15180 

ATAATTACTT AATAGCTTGT TAAGTATGTA 15240 

TCGAGTGTTC GGATTTAAAC ATTTATTAAT 15300 

AATTATTTAT TATTTTAGGT GCATTAAACG 15360 

GTGCGCATGG TTTACAAGGA AAAATAAGTG 15420 

CGACGTATCA AATGTACCAT GGCTTAGCAT 15480 

CTTCAATCAA TGTTAACTGG GCTGGCTGGT 15540 

GATCATTATA TATTTTAGTA TTAACTCAAA 15600 

GTGGCGTATT GTTCATCATT GGATGGATAA 15660 

GTTAAATTTT AAAACTTTAG ATTACCTATG 15720 
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TGGGTATAGA ATACCTTCGA GGTGAGTTTT TATTTATGGA AAAAAAGAAT AAGCAAATAG 15840 

ATAGAGGCGA TTTAAAACAA AACCTATCTG AAAAGTTTGT ATGGGCGATT GCATATGGTT 15900 

CATGTATCGG ATGGGGCGCA TTCATCTTAC CAGGAGACTG GATTAAGCAG TCAGGTCCGA 15960

TTGCAGCATC AATTGGTATA GTTATTGGTG CATTATTAAT GATATTAATT GCGGTTAGTT 16020 

ATGGCGCATT AGTAGAGAGA TTTCCAGTAT CAGGGGGCGC GTTTGCCTTT AGTTTCTTAA 16080 

GTTTCGGCAG ATATGTGAGT TTCTTCTCAT CATGGTTTTT AACTTTTGGT TATGTCTGTG 16140 

TCGTTGCTTT AAAtGCGACC GCATTCAGTT TACTAGTTAA ATTCTTATTG CCAGATGTCT 16200 

TAAATAATGG GAAACTATAC ACCATTGCGG GCTGGGACGT TTATATTACG GAAATCATTA 16260

TTGCGACCGT ATTACTACTT GTATTCATGC TAGTAACGAT TCGTGGCGCA AGTGTATCTG 16320 

GATCATTACA ATATTATTTC TGTGTGGCGA TGGTAATCGT CGTATTATTG ATGTTCTTTG 16380 

GTTCATTCTT TGGTAATAAT TTTGCACTTG AAAATTTACA ACCGTTAGCT GAACCTAGCA 16440

AAGGATGGTT AGTGTCTATT GTGGTTATTG TATCCGTGGC ACCATGGGCA TATGTTGGAT 16500 

TTGATAATAT TCCACAAACA GCAGAAGAGT TTAACTTTGC ACCAAACAAG ACATTTAAGC 16560 

TTATCGTGTA CAGTTTATTA GCAGCATCAT TAACTTATGT TGTCATGATT TTATACACTG 16620

GTTGGTTATC AACAAGTCAT CAAAGTTTAA ATGGGCAGTT GTGGTTAACA GGTGCTGtTA 16680 

CACAAACAGC ATTTGGTTAT ATTGGATTAG GTGTATTAGC AATTGCAATT ATGATGGGTA 16740 

TATTTACTGG TTTAAATGGA TTCTTGATGA GTTCAAGTCG CTTGTTATTT TCTATGGGAC 16800 

GTTCAGGTAT TATGCCAACA ATGTTTAGTA AATTACATAG TAAATACAAA ACACCATATG 16860 

TCGCAATCAT ATTCCTAGTA GGAGTGTCGT TAATTGCACC TTGGCTAGGA AGAACTGCAT 16920 

TGACTTGGAT TGTAGATATG TCATCTACTG GTGTATCCAT TGCCTACTTT ATTACATGTT 16980 

TGT(5*GCAGC GAAATTATTC AGTTATAACA AACAAAGTAA TACGTATGCA CCGGTTTACA 17040 

AAACGTTTGC TATTATCGGC TCATTTGTAT CATTCATTTT CTTAGCGTTG TTATTAGTGC 17100

CAGGTTCTCC TGCAGCACTG ACTGCACCGT CTTATATTGC ATTACTTGGA TGGTTAATCA 17160

TCGGTTTAAT ATTCTTTGTG ATTCGATATC CTAAATTGAA AAATATGGAT AATGATGAAT 17220

TAAGTCGCTT GATTTTAAAT AGAAGTGAAA ATGAAGTTGA TGATATGATT GAAGAACCTG 17280

AAAAAGAAAA AACTAAATAA TAAAAGAATC GCACAATAAA CCTTCTTCAT TCGGAGGCGT 17340 

ATCGTGCGAT TTTTTGTATT ATAAATTGAC ATTTAAGACG AGGCAGCTGA ACCTTATATA 17400 

TAATTGCTAA GAGTTAGGGC TGAGCCATTT CTAACAAATA TTTATAATCG TTTAAAAGAT 17460 

TTCACGAACC CAGAAACAAT TAATTTGGAA ATTTGGTCGG CGAATAATAA ACCTAATGCG 17520 
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AAGACTAAAT TTTTTGTAGC ATCGTATGCT AAGCCACCAG GTACTAATGG AATGATACCC 17640

GTTACCATAA AAATGATGGC AGGTTCTTTT TGTTTACGAG CCATATAATG ACTTAACAAG 17700 

CCTAATGCTA AACTACCAAA GAAACTAGAG TATATAGTGT GCACATTAAA GCCGTTGAAG 17760

AATAAGGTGT AAACCATCCA TCCACACGTA CCAACGAAAC CACATGATAG ATATAATTTT 17820 

CTAGGTGCAT CAAAAATGAC GCAGAA 17846 

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5544 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:








wvj lunnnu 1 *\ 






O 21 21 21 21 TYTtfS 21 H 


2i ptt 2i n n 2i or* 






fZ 21 TCZ 21 A Ptt fi C 


f 21 ITTll 21T 21 T 


nil nuun 1^1 


V TP A 21 21 TP21 




X W V 


TGGTAATCAA TCACCGCAAA TAATTATTCA AGATATTGCG ATGAATGAAC AGCAAATATT 180
AGATTATAGA AGTAAGCGAA AAAGTTTACC TTTTACAGAA AATGATGAAA ATATTGTCGT 240
GCTTATTCAT CCTAAAAGTG ATAAAGTAAA TGCGAATGAA TATTATTATG GTGAAGAAAT 300
TAAACAACAA ACTGATAAAG TAGTATTAAG AGATTTACCA ACGTCAATGG AAGACTTGTC 360
TAATTCCTTG CAACAACTGC AATTTTCTCA ACTTTATATA GTTTTGCAAC ATAATCATTC 420
GATTTACTTC GATGGTATAC CTAATATGGA TATTTTTAAA AAGTGTTATA AAGCATTAAT 480
AACTAAACAA GAAACAAATA TCCAGAAAGA GGGTATGTTA TTGTGTCAAC ATTTAAGTGT 540
GAAACCAGAT ACACTTAAAT TCATGTTGAA AGTTTTCTTA GACTTAAAAT TTGTAACACA 600
AGAAGATGGT TTAATTCGAA TCAATCAACA ACCTGATAAA AGATCGATTG ATTCCAGCAA 660
AGTATATCAA TTAAGACAAC AACGTATGGA TGTTGAAAAG CAATTATTAT ATCAAGATTT 720
TTCAGAAATA AAAAATTGGA TAAAGTCACA ATTGTCGTGA GCAATTTAGG AGGAAATATT 780
AATGGATTTA AAGCAATACG TATCAGAAGT TCAAGATTGG CCGAAACCAG GTGTTAGTTT 840
CAAGGATATT ACTACAATTA TGGATAATGG TGAAGCATAT GGCTATGCAA CAGATAAAAT 900
TGTAGAATAC GCAAAAGACA GAGATGTTGA TATCGTTGTA GGACCTGAAG CGCGTGGCTT 960
TATCATTGGC TGTCCTGTAG CTTATTCAAT GGGGATTGGC TTTGCACCTG TTAGAAAAGA 1020
AGGGAAATTA CCTCGTGmAG TCATTCGTTA TGAGTATGAC CTAGAATATG GTACAAATGT 1080
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ATTAGCTACT GGTGGTACGA TTGAAGCAGC 
CGTAGTAGGT ATTGCATTTA TAATTGAATT 

AGATTACGAT GTTATGAGTT TAATCTCATA
AATGAAATCC TTCATCAAAT GTATAAGAAC
TTTCTTAACA TGAGATGTTA GGATTTTTTA
ATACCTTAAT AACATCGTTT ATTTATTTCA
AAAAATGAAA CAGTAGATTT AGGTCGAATT
TACAAATTAA ACTCGCTCAA GTAAAATTAA
TTATCGTCGA CGGACGTATG ATTGGTGTGG
TCATTGTTTA AGGCGAAGTA ATAAATATGA
ATCCATATAG TGCAGACGAA tTCTTCACAA
TGAGTATGTT TTAAAAAGCT ATCATATTGC
AAACGGATTA CCATACATTA TGCATCCTAT
ATTAGACGGA CCGACGATTG TCGCAGGTTT
TACATTTGAA GATGTAAAAG AAATGTTCAA
GACGAAGCTT AAAAAAGTAA AATACCGCTC
CAAGTTATTT ATTGGGATTG CCAAAGATGT
ATTACATAAT ATGCGTACCT TGAAAGCCAT
AGAAACATTA GAAATTTATG CACCATTAGC
GGAACTAGAA GATACGGCTC TTCGTTATAT
TTTAATGAAG AAGAAACGTA GTGaACGTGA
ACGTACTGAA ATGGACCGAA TGAATATCGA
TTACAGTATT TATCGGAAAA TGATGAAGCA
GTTGGCGATA CGTGTTATTG TCAATTCTAT
GCATACGTTA TGGAAACCGA TGCCAGGACG
AAATTTGTAT CAGTCATTGC ATACTACAGT
CCAAATACGA ACGTTTGATA TGCACGAAAT
TTACAAAGAA GGTAAAAAAG TAAGTGAAAA
GTTAAAAGAA TTAGCTGAAG CGGATCATAC



AATAAAATTA GTTGAAAAAT TAGGCGGTAT 1200 

GAAATATTTA AATGGTATTG AAAAAATTAA 1260 

CGACGAATAA TAAATAATAT AATTTTATCA 1320 

CAATGACTTA ATTAAAAAAG TTGTTTAAGT 1380 

TTTACTGAAA ATGTTAGATG ATTGAGCATT 144 0 

TAAATTGTAG TATCATAGAA CTAATATTTA 1500 

TTTGTAAAAG TTTTAAAAGT AGGAATAGTA 1560 

TATTACGATT AATGACGACA GGATAAATAT 1620 

GACAAATACT ATTCAACAAG AGTACCTAAA 1680 

ATGGGGTGTA TCATATAATG AACAACGAAT 1740 

AGCAAAATCA TATTTGTCAG CAGATGAATA 1800 

TTATGAAGCA CATAAAGGTC AGTTCCGAAA 1860 

ACAAGTTGCA GGTATTTTAA CAGAAATGCG 1920 

TTTGCATGAT GTAATTGAAG ATACACCGTA 1980 

TGAAGAAGTT GCTCGAATTG TTGATGGTGT 2040 

AAAAGAAGAA CAACAAGCTG AAAATCATCG 2100 

ACGCGTAATT TTGGTGAAAT TAGCAGACAG 2160 

GCCGCGCGAA AAACAAATTA GAATTTCTCG 2220 

ACATCGTCTT GGTATTAATA CAATCAAATG 2280 

TGATAATGTG CAATATTTTA GAATAGTCAA 2340 

AGCGTATATC GAAACGGCTA TTGATAGAAT 2400 

AGGCGATATA AATGGTAGAC CTAAACATAT 2460 

GAAAAAACAA TTTGATCAAA TTTTTGATTT 2520 

TAATGATTGT TATGCGATAC TTGGGTTGGT 2580 

TTTTAAAGAT TATATTGCAA TGCCTAAACA 2640 

AGTAGGCCCA AATGGAGACC CGCTCGAAAT 2700 

TGCTGAGCAT GGTGTTGCAG CACACTGGGC 2760 

AGATCAAACT TATCAAAATA AGTTAAATTG 2820 

ATCGTCTGAC GCTCAAGAAT TTATGGAAAC 2880 
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• TGAGTTGCCA TATGGTGCTG TGCCGATTGA TTTTGCTTAT GCGATTCACA GTGAAGTAGG 3000 

TAATAAGATG ATTGGTGCCA AGGTGAATGG CAAAATTGTA CCAATTGACT ATATTTTACA 3060 

AACAGGCGAT ATTGTTGAAA TACGTACTAG TAAACATTCA TATGGACCAA GTCGTGATTG 3120 

GTTGAAAATT GTTAAATCGT CTAGTGCCAA AGGTAAAATT AAAAGTTTCT TCAAAAAACA 3180 

AGATCGTTCA TCTAATATTG AAAAAGGCCG AATGATGGTT GAAGCTGAAA TAAAAGAGCA 3240 

AGGATTTAGA GTCGAAGATA TTTTGACAGA GAAAAATATT CAGGTTGTTA ATGAAAAATA 3300 

TAACTTTGCA AATGAAGATG ATTTATTCGC AGCTGTAGGA TTTGGCGGCG TGACATCCTT 3360 

ACAGATTGTT AATAAATTAA CTGAAAGACA ACGTATTTTA GATAAACAAC GTGCTTTAAA 3420 

TGAAGCACAA GAAGTTACGA AATCATTGCC TATTAAAGAC AACATCATTA CTGATAGTGG 3480

TGTCTATGTA GAAGGTTTAG AAAATGTACT TATCAAGTTG TCAAAATGTT GTAATCCTAT 3540 

ACCaGGTGAT GATATTGTAG GTTATATCAC CAAAGGTCAC GGTATTAAAG TACATCGCAC 3600

TGATTGCCCA AATATTAAGA ACGAAACTGA ACGACTAATT AATGTTGAAT GGGTAAAATC 3660 

AAAAGACGCA ACTCAAAAAT ATCAGGTTGA TTTAGAGGTA AtGCGTATGA CCGAAATGGC 3720 

TTCTTGAATG AAGTACTACA AGCTGTTAGC TCGACAGCCG GCAATTTAAT TAAAGTTTCA 3780

GGACGTTCAG ATATTGATAA AAATGCAATA ATAAATATTA GTGTCATGGT GAAAAACGTG 3840

AATGATGTTT ATCGTGTGGT AGAAAAGATC AAACAACTTG GTGATGTTTA TACAGTAACA 3900

AGAGTTTGGA ACTAGAGGTG CAAAATATGA AAGTAGTTGT ACAAAGAGTT AAAGAAGCAT 3960 

CGGTGACGAA TGATACATTA AATAATCAAA TCAAAAAAGG ATATTGTTTA TTAGTCGGTA 4020 

TCGGTCAGAA CTCTACAGAG CAAGATGCAG ATGTAATTGC AAAGAAAATT GCTAATGCAA 4080

GATTATTTGA AGATGACAAT AATAAATTAA ACTTTAATAT CCAACAAATG AATGGTGAAA 4140

TACTATCAGT TTCACAATTT ACTCTCTATG CAGATGTAAA AAAAGGTAAC CGTCCAGGTT 4200 

TCTCAAATTC TAAAAATCCT GATCaAGCGG TAAAAATTTA TGAGTATTTT AATGcaTGCG 4260 

CTACGAGCGT ATGGTCTTAC TGTGAAAACA GGTGAATTTG GAACACACAT GAATGTTAGC 4320

ATAAATAATG ATGGTCCAGT CACTATTATT TATGAAAGTC AGGACGGCAA AATTCAATGA 4380 

AAAAAATAGA GGCATGGTTA TCTAAAAAGG GTCTTAAAAA TAAACGTACT CTAATAGTAG 4440

TGATTGCCTT TGTCTTATTT ATCATCTTTT TATTTTTATT GCTGAATAGC AATAGTGAAG 4500

ATAGTGGGAA CATCACGATA ACTGAAAATG CTGAATTACG TACAGGTCCA AACGCTGCGT 4560

ATCCAGTCAT ATATAAAGTT GAAAAAGGTG ACCATTTTAA AAAGATTGGT AAAGTAGGTA 4620

AATGGATTGA AGTTGAAGAT ACATCCAGTA ATGAAAAAGG TTGGATAGCT GGATGGCACA 4680

TAGTGCTTGA TCCTGGTCAT GGAGGTAGTG ACCAGGGTGC TTCAAGCAAT ACTAAATATA 4800

AAAGTTTAGA AAAAGATTAT ACGTTGAAAA CAGCAAAAGA ATTGCAGCGT ACTTTAGAAA 4860 

AAGAAGGCGC AACTGTTAAG ATGACAAGAA CAGACGATAC ATATGTTTCA CTAGAAAATC 4920 

GTGATATCAA AGGCGATGCC TATTTGAGTA TACATAATGA TGCGTTAGAA TCATCTAATG 4980 

CAAATGGAAT GACaGTTTAT TGGTATCATG ATAATCAAAG AGCTTTAGCA GATACGTTAG 5040 

ACGCTACGAT TCAGAAGAAA GGTCTACTTT CTAATCGCGG TTCAAGACAA GAAAATTATC 5100 

AAGTGTTAAG ACAAACAAAA GTTCCTGCTG TTTTATTAGA ATTAGGTTAT ATTAGTAACC 5160 

CAACTGATGA AACGATGATT AAAGATCAAT TACATAGACA AATTTTAGAA CAAGCAATTG 5220 

TTGATGGCCT TAAAATTTAT TTTTCTGCGT AGGGCTTGCA AAAATATGTG AAAGTAGTTA 5280 

TCATTGATAT TGAATTTTAT AACTAAAACC GTTAGTATTC TTGAAATGGT AAATGAAATA 5340 

GGTAGCAATC TAACTAAGAT TGTGTAGGAA TATAATCCAT AGACTGAAAG ATTATGCTGA 5400

GTAGTTTATA TACATTGAAC ACAAGAAGAG GTGCTTTATG AAAAGTAAAG CCGTTAAACG 5460 

TACGTTaAAC GTTTTGAGTG GGTTTATTAA ATGCACGCTT ATAAAAAGTA ATGATGATTA 5520 

CAATTAGGCA TGTTTTTTAA ACCA 5544

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1067 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

AAAAGATTGC AAATATAAAT GGCATGTTTA ATATGTTAGA ACAACAAATC ATTCATAGCC 60 

AAGATATGGC TCATTTTAGA AGTGAATTTT TTTACGTCAA TCATGaGCAT CGAGAAAACT 120

ATGAAgCACT CCTAATTTAT TACAAAAATA GTATCGACAA TCCTATTGTA GATGGTGCAT 180 

GTTATATTTT AGCCCTACCT GAAATTTTCA ATAGTGTTGA TGTTTTCGAA TCAGAGTTAC 240

CATTTTCATG GGTATATGAT GAAAATGGCA TTACCGAAAC AATGAAATCA CTTAGCATTC 300

CATTACAATA TTTAGTTGCA GCAGCTTTAG AAGTAACTGA TGTGAATATA TTTAAGCCTT 360 

CAGGATTTAC AATGGGAATG AATAATTGGA ATATTGCTCA AATGCGAATC TTTTGGCAAT 420 

ATACAGCAAT TATTAGAAAA GAAGCACTAT AACATTAATA ATTAATTAGC TATAAAGATG 480

ATTCACAACA ATCATCTTTA TAGCTTTTTT ATGTCTAATT ATTTTTGAGG AAAATmACAA 540

AATTTTATGT TTTCAAAAGT AAACAATCAA AAGATGTTAG AAGATTGCTT CTATATAAGA 660

AAGAAAGTGT TTGTAGAAGA ACAAGGCGTC CCTGAGGAAA GTGAAATTGA TGAATATGAA 720 

TCTGAATCTA TTCACCTCAT TGGATATGAT AATGGACAGC CAGTTGCCAC TGCTCGAATA 780

CGCCCTATTA ATGAAACAAC TGTCAAAATA GAACGAGTAG CTGTGATGAA ATCACATCGT 840

GGACAAGGAA TGGGTAGAAT GCTTATGCAA GCTGTAGAAT CATTAGCTAA AGATGAAGGT 900 

TTTTACGTAG CTACTATGAA TGCCCAATGT CATGCTATCC CATTTTATGA AAGTTTAAAC 960 

TTTAAAATGA GAGGTAATAT ATTTCTTGAG GAAGGCATCG AGCATATTGA AATGACAAAA 1020 

AAGTTAACCT CGCTTAATTA AAAAAAGTTG TATCTATTTT AGAAACA 1067 

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18613 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

AAGACGtAtG ATAACAACAA TACgTGTAGT GAAAGATTTT AATCTACATA TTACTGACAA 60

AGAATTCATT GTATTTGTTG GACCATCGGG ATGTGGTAAA TCAACAACAT TACGAATGGT 120

TGCTGGACTA GAGTCTATCA CATCTGGAGA TTTTTATATT GATGGGGAAC GCATGAACGA 180

TGTTGAACCA AAGAATAGAG ATATTGCGAT GGTATTTCAA AACTATGCAT TATATCCACA 240

TATGACTGTT TTTGAAAATA TGGCATTTGG GCTAAAGCTA CGTAAAGTAA ATAAAAAAGA 300

GATTGAACAA AAAGTTAATG AAGCAGCTGA AATATTAGGA TTAACTGAGT ATCTTGGTCG 360

TAAACCAAAA GCGTTATCTG GCGGACAGCG TCAACGTGTT GCTTTGGGCA GAGCTATTGT 420

TAGGGATGCG AAAGTCTTTT TAATGGATGA ACCATTATCG AATCTTGATG CGAAyTtCGA 480

GTACAAATGC GCACAGAAAT ATTGAAATTA CATAAGCGAC TTAATACTAC GACAATTTAT 540

GTTACACATG ATCAAACTGA AGCATTGACG ATGGCTAGTC GAATTGTTGT TTTGAAAGAT 600

GGCGACATTA TGCAAGTCGG CACACCTAGA GAAATATATG ATGCCCCTAA TTGCATATTT 660

GTGGCGCAAT TTATCGGCTC ACCAGCAATG AATATGTTGA ATGCTACAGT TGAAATGGAC 720

GGATTGAAGG TAGGAACACA CCATTTTAAA TTACATAATA AAAAATTTGA AAAGTTAAAA 780

GCTGCTGGCT ACTTAGACAA GGAAATTATT TTAGGTATTC GAGCTGAAGA CATTCATGAA 840

GAACCAATAT TTATTCAAAC TTCTCCAGAG ACACAATTTG AATCTGAAGT AGTTGTATCC 900

AAATTAGATT CAAGAACTCA AGTGATGGCG AACGACAAGA TTACACTAGC ATTTGATATG 1020

AATAAGTGTC ACTTTTTTGA TGAAAAAACA GGAAATCGTA TCGTCTAAGG GGGAGTATTC 1080 

ATGTCTAAAA TTTTAAAATG TATCACGTTA GCCGTGGTAA TGTTATTAAT CGTAACTGCA 1140

TGTGGCCCTA ATCGTTCGAA AGAAGATATT GATAAAGCAT TGAATAAAGA TAATTCTAAA 1200 

GACAAGCCTA ACCAACTTAC GATGTGGGTG GATGGCGACA AGCAAATGGC GTTTTATAAA 1260 

AAAATTACGG ATCAATATAC TAAAAAAACT GGCATCAAAG TAAAGCTTGT AAATATTGGT 1320 

CAAAATGATC AACTAGAAAA TATTTCGCTA GACGCTCCTG CAGGAAAAGG TCCAGATATC 1380 

TTTTTCTTAG CACATGATAA TACTGGAAGT GCCTATCTAC AAGGCTTAGC TGCTGAAATC 1440 

AAATTATCAA AAGATGAGTT GAAAGGTTTC AATArGCAAG CACTTAAAGC GATGAATTAT 1500 

GACAATAAGC AACTAGCATT GCCAGCTATC GTTGAAACAA CCGCACTTTT TTATAATAAA 1560 

AAATTAGTGA AAAATGCACC GCAAACGTTA GAAGAAGTTG AAGCTAATGC TGCCAAACTA 1620

ACTGATAGTA AAAAGAAACA ATACGGTATG TTATTTGATG CTAAAAATTT CTATTTTAAT 1680 

TATCCGTTTT TATTCGGCAA TGATGATTAT ATTTTCAAGA AAAATGGCAG TGAATATGAT 1740 

ATTCATCAGC TAGGACTAAA TTCAAAACAT GTCGTCAAGA ATGCTGAACG ATTACAAAAA 1800

TGGTACGACA AAGGGTATCT TCCTAAGGCA GCAACACATG ATGTCATGAT TGGTCTTTTT 1860

AAAGAAGGAA AAGTAGGACA ATTTGTCACT GGACCGTGGA ACATTAATGA ATATCAAGAA 1920 

ACGTTTGGTA AAGATTTAGG AGTAACAACA TTACCTACAG ATGGTGGCAA ACCTATGAAA 1980 

CCATTTCTAG GTGTACGTGG TTGGTATTTA TCTGAATATA GTAAACATAA GTATTGGGCT 2040 

AAAGATTTAA TGCTGTATAT CACTAGTAAA GATACATTAC AAAAATATAC AGATGAAATG 2100 

AGCGAAATTA CTGGACGTGT TGACGTGAAA TCATCTAATC CAAATTTAAA AGTGTTTGAA 2160 

AAGCAAGCAC GTCATGCTGA ACCGATGCCT AATATTCCTG AAATGCGACA AGTTTGGGAA 2220 

CCGATGGGCA ATGCAAGCAT ATTTATTTCA AATGGTAAGA ATCCTAAACA AGCGTTAGAT 2280 

GAGGCGACGA ATGATATAAC GCAAAATATT AAGATTCTTC ATCCATCACA AAATGATAAG 2340 

AAAGGAGATT AGTTATGACG AAACGTAACC CTAAATTAGC GGCATTATTA TCTGTTATAC 2400 

CTGGTTTGGG ACAGTTTTAT AATAAAAGAC CCATTAAAGG GACGATATTT TTTATCTTTT 2460

TCATCAGTTT TATTTCTGTT TTTTATAGCT TTTTAAATAT TGGTTTTTGG GGATTGTTCA 2520

CATTAGGGAC AGTACCTAAG TTAGACGATT CTCGTGTCTT ACTTGCACAA GGTATTATTT 2580

CTATCTTACT CGTTGCTTTC GCAATCATGC TATATATCAT TAATATTTTA GATGCATATC 2640

GTAATGCTGA ACGATTTAAT CGCAATGAGG AAATAAAGGA TCCGAAGcGC GTATGGTGGC 2700

TGTAGTTGTA TTTCCATTAA TAyyTATGTT TGGAGTAGCA TTTACAAATT ACAATTTATA 2820

CAACGCGCCT CCGAGACACA CATTAGAATG GGTTGGTTTA GATAACTTTA AAACGTTATT 2880 

CACAATTGGC GTTTGGCGTA AAACATTTTT CAGTGTTATT ACTTGGACAT TAGTATGGAC 2940 

GCTTGTTGCA ACGACACTTC AAATTGCATT AGGGCTGTTT TTGGCAATTA TTGTAAATCA 3000 

CCCTGTCGTC AAAGGTAAGA AATTTATCCG TACTGTGTTA ATCCTACCTT GGGCTGTACC 3060 

ATCATTTGTG ACAATTTTAA TATTTGTAGC GTTATTTAAT GATGAATTTG GTGCGATAAA 3120 

TAATGATATT TTGCAACCTT TATTAGGTGT AGCACCAGCA TGGTTAAGTG ATCCGTTTTG 3180 

GGCAAAAGTG GCATTAATCG GCATTCAAGT ATGGCTTGGA TTCCCATTTG TCTTTGCACT 3240

GTTCACTGGA GTACTGCAAA GTATTTCATC AGATTGGTAC GAAGCAGCAG ATATGGATGG 3300 

TGCGTCTAGT TGGCAAAAGT TTAGAAACAT CACATTCCCG CATGTCATTT ACGCCACAGC 3360 

GCCATTGTTA ATTATGCAAT ATGCAGGTAA TTTCAATAAT TTTAATCTTA TTTATCTATT 3420

TAATAAAGGC GGTCCACCAG TGTCAGGGCA GAATGCTGGT AGTACAGATA TCTTGATATC 3480

TTGGGTGTAT AATCTGACAT TTGAGTTTAA CAACTTCAAC ATGGGTGCAG TTGTGTCATT 3540

AATTATTGGA TTTATTGTTG CTATTGTCGC ATTTATTCAA TTCAGACGTA CAAGTACGTT 3600

TAAAGATGAG GGAGGTTTAT AAGATGACAA AGAAGAAAAA CATATTAAAA GCAATCGGTA 3660 

TTTACAGTTT TATAGCGATG ATGTTTGTCA TCATTTTATA TCCACTACTG TGGACATTTG 3720 

GCATTTCCCT TAATCCAGGT ACGAACTTGT ATGGTGCCAA AATGATACCA GACAATGCAA 3780 

CATTTAAAAA TTATGCATTC TTACTATTCG ATGACAGTAG TCAATACCTG ACTTGGTATA 3840

AAAATACGCT TATCGTAGCA TCTGCAAATG CACTGTTTAG TGTGATATTT GTCACGTTAA 3900 

CAGCATATGC TTTTTCTAGA TATCGCTTTG TTGGTCGTAA ATACGGGCTG ATTACATTTT 3960 

TGATTTTACA AATGTTCCCT GTATTAATGG CAATGGTCGC AATCTATATT TTGCTAAATA 4020 

CAATTGGATT ATTAGATTCT TTATTTGGAC TAACACTGGT ATATATTGGT GGATCAATAC 4080

CGATGAATGC CTTTTTAGTG AAAGGTTACT TCGATACGAT TCCAAAAGAA CTTGATGAAT 4140

CTGCCAAAAT TGATGGTGCA GGGCATATGC GTATTTTCTT ACAAATTATG CTTCCATTAG 4200

CTAAGCCGAT TTTAGCAGTT GTTGCTTTGT TCAATTTTAT GGGGCCATTT ATGGACTTTA 4260

TATTACCTAA AATACTATTA AGAAGTCCTG AAAAATTCAC ATTAGCAGTT GGATTGTTCA 4320 

ACTTTATTAA TGATAAGTAT GCAAATAATT TCACAGTGTT TGCAGCAGGG GCAATTATGA 4380 

TTGCAGTACC TATAGCAATC GTATTCTTGT TCTTGCAACG CTATTTAGTA TCAGGTTTAA 4440

CAACAGGTGC GACAAAAGGT TAGTTTGAAA TTAGGAGTGG GGCAGAATTG ATAAAGAACC 4500

GGGTGTGGTG GTATTGCGAA TGGCAAGCAC ATGCCAAGTT TACAAAAAGT TGAAAATGTT 4620

GAAATGATCG CATTTTGTGA CGTAGACATT TCGAAAGCAG CGAGTGCGGC AGAAGCATAC 4680

GGAACTGACA ATGCAAAGGT TTATGATGAT TACAAAGCAT TGTTAAAAGA TGACACGATT 4740 

GATGTTATCC ATGTTTGTAC GCCAAATGAC TCGCATTGTG AAATTACTGT AGCAGGGTTG 4800 

CATGCTGGTA AACATGTGAT GTGTGAAAAA CCAATGGCTA AAACGACAGC AGAAGCTCAA 4860 

AAAATGATAG ATACAGCTAA ATCAACAGGT AAAAAATTAA CAATAGGTTA TCAAAATCGT 4920

TTCCGAGCAG ATAGTCAATT TTTACATCAA GCAGCGCAAC GTGGCGACTT AGGAGACATT 4980

TACTTCGGAA AGGCACATGC CATTCGTCGT CGAGCAGTAC CAACATGGGG TGTCTTTCTA 5040 

GACGAAGAAG CTCAAGGTGG AGGACCATTA ATCGATATCG GTACACACGC TTTAGATTTA 5100 

ACGTTATGGA TGATGGATAA TTATGAACCA GAATCAGTGA TGGGTTCAAC ATTCCATAAA 5160

TTAAATAAAC AGCATCATGC GGCAAACGCT TGGGGTTCAT GGAATCCAGA TGAATTTACA 5220

GTTGAAGATT CTGCGTTTGG ATTTATTAAA ATGAAGAATG GAGCGACGAT CATTTTAGAA 5280

TCCGCTTGGG CGATTAATTC TTTAGAAGTG GATGAGGCAA AATGTTCATT ATCAGGAACT 5340

AAAGCAGGTG CTGATATGAA AGATGGTCTA CGTATTCATG GTGAAGACAT GGGTACACTT 5400

TATACCAAAC ACGTTGAATT GGAAAACAAA GGCGTCGACT TTTATGAAGG TAATGAAGTG 5460 

GATGAAGCTG AAGAAGAAGC AAAAGCTTGG ATTGATGCAG TTGTAAATGA TACTGAACCA 5520 

GTTGTGAAAC CGGAACAAGC AATGGTAGTT ACAAAAATTC TTGAAGCGAT TTATCAGTCT 5580 

GCAAAATCAG GCAAAGCAAT TTACTTTGAA TAACATCATA CGGTAAGGAG GCACATCATG 5640 

ACAAAATTAA AAGTTGGTGT GATAGGTGTT GGTGGTATTG CACAAGACCG TCATATTCCA 5700 

GCATTGCTGA AACTCAAAGA CACAGTCTCA TTAGTTGCAG TACAAGATAT TAATACAGTG 5760 

CAGATGATTG ATGTTGCGAA gCGCTTTAAT ATACCTCATG CAGTTGAGAC ACCTAGCGAG 5820 

CTGTTTAAAC TTGTTGATGC GGTGGTCATT TGTACACCTA ATAAATTCCA TGCTGATCTT 5880

TCTATAGAAG CATTGAACCA TGGTGTCCAT GTATTGTGTG AAAAGCCAAT GGCGATGACG 5940 

ACGGAAGAGT GTGATCGCAT GATTGAAGCG GCTAATAAAA ATCACAAATT ATTAACTGTC 6000 

GCATATCATT ATCGTCACAC AGATGTGGCA ATTACTGCTA AAAAAGCAAT TGAATCAGGT 6060

GTGGTTGGTA AACCTTTAGT AGCACGTGTA CAAGCGATGC GTAGGCGTAA AGTGCCTGGC 6120 

TGGGGTGTTT TTACCAATAA AGCGTTGCAA GGTGGCGGTA GTTTAATCGA TTATGGTTGC 6180 

CACTTGTTAG ACTTATCTTT GTGGCTACTA GGTAAAGATA TGGTGCCGCA TGAAGTGCTA 6240

GGAAAAACAT ATAATCAATT GAGCAAACAA CCGAATCAAA TTAATGATTG GGGAACATTT 6300

GCAAGCATGC AGTTTGAATG TTCGTGGTCT GCAAATATCA AAGAAGATAA GGTTCACGTT 6420

AGTTTATCAG GAGAAGATGG CGGTATCAAT TTATTTCCAT TTGAAATATA TGAGCCCCGC 6480

TTTGGAACTA TTTTTGAAAG CAAAGCTAAT GTTGAGCATA ACGAAGACAT TGCTGGTGAG 6540 

AGACAGGCGC GTAACTTTGT CAATGCGTGT TTAGGGATAG AAGAGATTGT GGTGAAACCG 6600 

GAAGAAGCAC GCAATGTAAA TGCCCTTATA GAAGCGATTT ATCGTAGCGA TCTTGATAAC 6660 

AAGAGCATAC AACTTTAATG ATTATCATAT ATGATACAAA ATTCTCAATA TAAAAAGAAG 6720 

GAGTGCTTTT CAATGAAAAT AGGTGTATTT TCAGTATTAT TTTACGATAA AAATTTTGAA 6780 

GATATGTTAG ATTATGTCTC AGAATCTGGA TTGGATATGA TTGAAGTTGG AACAGGTGGT 6840 

AACCCAGGAG ATAAATTTTG TAAGTTAGAT GAGTTGTTAG AAAATGAAGA CAAGCGCCAA 6900 

GCATTTATGA AGTCAATCAC AGACAGAGGC TTACAAATAA GTGGTTTCAG TTGTCATAAC 6960 

AATCCAATTT CTCCAGATCC GATAGAAGCG AAAGAAGCCG ATGAAACGTT ACGTAAAACA 7020

ATCCGTTTAG CAAATCTATT AGACGTGCCA GTTGTTAATA CATTTTCTGG CATTGCAGGA 7080 

TCAGATGATA CCGCTAAAAA GCCTAATTGG CCTGTTACAC CTTGGCCAAC AGCCTACTCT 7140 

GAAATTTATG ATTATCAGTG GAATGAAAAG TTGATACCAT ATTGGCAAGA TTTAGCTGAG 7200 

TTTGCAAAAG AGCAAGATGT AAAAATTGCC ATAGAGTTGC ATGCAGGATT TTTAGTGCAT 7260

ACACCATATA CAATGTTGAA GTTACGTGAG GCTACAAATG AATATATCGG TGCTAACTTA 7320 

GATCCTAGTC ATCTATGGTG GCAAGGTATT GACCCAATTG CTGCGATTCG CATATTAGGC 7380 

CAAGCAAATG CAATTCATCA CTTCCATGCT AAAGATACGT ATATTAATCA AGAAAATGTA 7440

AATATGTATG GTCTAACTGA TATGCAACCA TATGGTAACG TTGCGACAAG AGCATGGACA 7500 

TTCCGTACAG TTGGTTATGG ACATAGTCCA TATGTATGGG CAGATATCAT AAGTCAACTT 7560 

ATTATTAATG GATATGATTA TGTATTAAGT ATTGAACATG AAGATCCTAT TATGTCAGTA 7620 

GAAGAAGGTT TCCAAAAAGC TTGTCAAACT TTGAAATCTG TTAATATTTA CGACAAGCCA 7680 

GCAGACATGT GGTGGGCATA ATACGAACTC GAGGTTAGTC TGAAGTTTGT CTGAAGTAAG 7740 

ACTGGTGGCA GTGTTGAATA AATGCATATG TCGCCAAGCC ATTGCCAAAA ATTTCACACC 7800 

TTAAATCAAG TCATTGTTTG TAAAGAAGGT GTACTTTATA TAAGTATATA GCGATGGTCA 7860

TACCCATTCA CAGTAACAAT CCTCACCATT GAAAAGAGTA TATAACCTTT TCAATAGTGA 7920 

GGTATATGAT AATAAAAAAA GCCTGTTGTC ACAATGGTCA TAGACACGAC ATACTTTAAA 7980 

GGTTTCTGAA TATAATATTT CAGAATGCAC TTTAAAGATG GACGTCGATG TAGACTAAAG 8040

TGATGACAGG CTTTCATCTT TTTAAATATT CATTAATTTC TCTTCTTGTT TAATACGTAC 8100

TAATACACCG ATTAATTCAG GAATGATGTT TAAGAAGTAA TTTGGGTGTT TTGTAATTTT 8220

ATATAATCCA GATTTAATAA TAGGATGGTT AGGTAAAATG AATAATTTTA ATGTCCAAAT 8280

ACCACCTAAA GTTTTAATAA CCATAAATAA CATGATATAA GCAAAGATTA ATATAACTAA 8340

GCCAATACCA TTTGCAAAGC TAAATGTATC TTTATTAATA AATGCCTCTA CACCAGCCAA 8400 

TACATAAATT AAAACGTGTG TTATTGCTAA AAACTTCGAA TTTTTAACGC CATATTCAAC 8460 

TGCACCGTCT GCTTTTAATT GTTTTGAGTG ATTAATAGAT ATCTTTAAGC TGACAAGTCT 8520 

GATACAGAAA AAGATAAGTA ATATAGATAG AATCATGATG TCCTCCGTCA TTATGTCATA 8580 

TGTATAAGCG TTGATTTTGA CAACATAAAG TATTTTATAG ATAAAGCTTG TCAAATACTA 8640 

TTAACTATTT ATTAATTTTA GTACATAAAT ATGTTTCTAA GTATGTGTTT ATGTTCAGTA 8700 

TTTTGGATAA TTTAATAATT TTAAGGATAT TAAGCGCTTA CACCGACGTG ATATATTTGG 8760

CTTAACGAAA ATGATTGAGG TGACAGAGAT GAACTTTTTT GATATCCATA AGATTCCGAA 8820

CAAAGGCATT CCATTATCGG TACAACGTAA ATTATGGCTT AGAAACTTCA TGCAAGCTTT 8880

CTTCGTAGTG TTCTTTGTTT ATATGGCTAT GTATTTAATT CGAAACAACT TTAAGGCGGC 8940

ACAACCGTTT TTAAAAGAGG AAATTGGATT ATCTACATTA GAACTTGGTT ATATCGGATT 9000

AGCATTTAGT ATCACGTACG GTTTAGGAAA AACATTACTT GGATATTTTG TCGATGGACG 9060 

TAACACAAAA CGTATTATCT CGTTCTTACT TATCTTATCT GCGATTACAG TTTTAATTAT 9120 

GGGATTTGTT TTAAGTTACT TTGGTTCTGT AATGGGATTA TTAATTGTAC TTTGGGGACT 9180 

TAACGGGGTG TTCCAATCAG TTGGTGGACC TGCAAGTTAT TCAACGATTT CAAGATGGGC 9240

GCCAAGAACG AAACGTGGCC GATACTTAGG ATTCTGGAAT ACATCACATA ATATCGGTGG 9300 

TGCCATAGCA GGTGGTGTTG CACTTTGGGG TGCTAATGTA TTCTTCCATG GAAATGTTAT 9360 

AGGGATGTTC ATTTTCCCAT CGGTGATTGC ATTACTTATT GGTATCGCAA CATTATTTAT 9420 

CGGAAAAGAT GATCCGGAAG AATTAGGATG GAATCGTGCT GAAGAAATTT GGGAAGAGCC 9480

GGTCGATAAA GAAAATATTG ATTCTCAAGG TATGACGAAA TGGGAGATCT TTAAAAAATA 9540

TATCCTGGGA AATCCTGTTA TATGGATTCT ATGTGTTTCA AACGTCTTTG TATACATTGT 9600

ACGAATCGGT ATTGATAACT GGGCACCGTT ATATGTGTCA GAGCATTTAC ACTTTAGTAA 9660

AGGCGATGCA GTTAATACGA TATTCTACTT TGAAATTGGT GCATTAGTTG CAAGTTTATT 9720 

ATGGGGCTAC GTATCAGACT TATTAAAAGG TCGTCGTGCA ATTGTAGCTA TTGGCTGTAT 9780 

GTTTATGATT ACATTTGTTG TCTTATTCTA CACAAATGCT ACAAGTGTCA TGATGGTTAA 9840

CATTTCATTG TTTGCATTAG GTGCGTTAAT CTTTGGTCCG CAATTATTAA TTGGTGTATC 9900

CGCGTATCTA TTCGGTGACT CAATGGCGAA AGTTGGTTTG GCGGCTATTG CTGATCCAAC 10020

ACGTAACGGT TTAAACATCT TTGGATATAC ATTAAGTGGA TGGACAGATG TTTTCATCGT 10080 

CTTCTATGTT GCATTATTCC TAGGCATGAT TCTATTAGGA ATCGTTGCTT TCTATGAAGA 10140 

AAAGAAAATT AGAAGTTTAA AAATTTAATA TAAATCGGAT TAAAAGTATC GCCAATCTAT 10200 

TGCAATATAG TTGGCAATCC TGCCCCGACG GCATGTGCGT GAAGAGATGA AAGATACTGC 10260 

TTCTACCCTT GCAAATATAT CATCTCTATG TCTCGGGGCA GATCATAATT CCCTGTTATG 10320 

AAGTATCCTT ATTTGCCCGA CTTAGGGTGA CTCAATGAAT TTACTCCTTA CAATAAAGAC 10380 

ATATAGOGGT GTCAATATTG TAGGGAGTAT TGTTTTATAT TTAAACTCTC TAAAAAGCGG 10440 

ACTGAAAGAA AAGTGAAAAC TTCTCTATCA GTCCGCTTTT TCATAGAACA AAATGGAGGC 10500 

GCCATAATCA TTAGTTATGT GCTAATCTAT TTTGCTTGCT TACAATAATC ACTTGGCGAC 10560 

ATTTGTAAAT ATTTTTTAAA ATGATAGCTA AACATTTTAT ACTCTGAAAA GCCTACTTTG 10620 

TCTGCAATTT CATAGTGTTT GTAATGTCGA TCTAACAATT GCAGAGATTG TAAAATACGA 10680 

TAGCGATTTA AATAATCGAC AATTGTAATA CCAACATGAT CTTTAAATGT TCGCATCGCA 10740 

TACGATTCAC TAACATCGAT ATGTTGAATT AAATCTGAAA CAGtCACTTT CGTTTGATAA 10800

GATTGCTTAA TTTGATCCAC AATCTGGTTT ACATAATAAT CATCGTATTC TACTTTTAAT 10860 

AGTGGTTGGA AGGCATCATG ACAAGATGCT AAGCTACGGC CGTTCTGTGA TTGTTGCTCT 10920 

AATAAGGTAC GGACAAGTCT TCCTAAAATA ACTTCTAATT GTGCATGGTC TACTGGTTTT 10980 

AATAAATAAT CAAGAACATG ATGTTGAATG CCGGCTTTCA TATATTCAAA GTCATCGTAA 11040 

CTCGATAATA TGATGAGATT ACAATCTAGA TGCGCAATAT CATTGAGTAA ATCGACGCCA 11100 

TTTTTACGTG GCATACGAAT ATCAGTAATT ACTAATTCTG GCTGATGTTG TTGAATTAGT 11160 

GATAATGCTT CAACACCATC TTTAGCAGTG TATATTGTAT TGAAATGATA GTCTCCCCAA 11220 

GGAATGATTT GCTTTAATCC TTCTCGAATA ATTCGTTCAT CATCACAAAT AACTACCTTA 11280 

AACATCTACA TTCCCCCTTG AAAGTGGTAT TTTATAACAA ATTAACGTAC CTTGATTACG 11340 

CTTTGAAAAA ATATGGAGTC GTGCATGTGA ACCATATTGA ATCATTGCTT TATTGTGTAA 11400 

ATGATTTAAT CCCAAATGCT TAGTATCAAA TACATCATTA TTAAGAGATT GGCGTACATA 11460

TTGCAGGCGA GATGACGACA TCCCGATACC ATTGTCGCAA ACTAAAACAT GTAAATTCTG 11520 

ACGTGCCAAT GTCAGGCGTA TAGTAATGTC CAATGACTCA GTATCTCTAC CATGTTTAAT 11580 

AGCATTTTCT ATGAGTGGCT GAAGCATCAT TTTACCAATT GTCTGGTGAC GCGCTTCTTC 11640 

AGAACTTTCA ATATGGAGCT TAATCATGTC ATCAAAACGG aTGTTTTGTA TTGCAACATA 11700

GTAACGTAAC ATTTGCGATA ATTGTTGGAC CACAGTTtGT GCTAATTTCG GAGATAACGT 11820

AATTAAATAT TGTATTGTTT GCATCGTATT GAATAGGAAA TGAGGCTGGA ATTGGCGTTC 11880 

TATTTCCTTT AACTGAATAT CACGCAAGCG ACGTTCTGTA TGCTCGATAG AATGGATCAG 11940 

TTGCTCATTT GATTCAAATA AATCGTAAAT ATAATTATTA ATTTCTTCTA GTTCACTGTT 12000 

GTTTTTTAAA GGCGTATATG TACCTAGATG ACGATTTTTG GCATAGTAAA TTTTTTGAAT 12060 

AATCGTTTCG ATATCTTTTG TTTGTCGTTT AGCCATATTA TCTGCGCTAA TGAAACCAAA 12120 

TATTACTAGT AAAACAAGAA CTACGGCCAT AACAATTAAC AACGTGATAC CATCTTCAAT 12180 

GTTTTCATGT ATATCTTTAT AAATAATGAG ACGATGGTCA GCATGGTTTA ATTTTACAGA 12240 

TTCATTCATA AATCCGAATT GTTGTGGTcT ATACTTTTCA CCTATAGTAA AACGGTGATC 12300 

GTTGGCGTAT AAAATATTGT CATATTGATC AmCGATAAGT GCGAATTGTC GGTTATCTTT 12360 

CtTAATTTCA CTTAAACGTG GGGTGTtAGC CATATAAATt TTaAGCATAT ATGTACTATT 12420 

TTTGAATTTA AGCTGATGCG TTGAAAATAA ATACATATTT TTAGTGTTTA AATGTTCATA 12480

ATTATTGGTT ATAAACTGAT TTGGTCCAGA TAATTCATAA TAAAGTGTTG CGGGCTGTTG 12540 

GkGTATTAAT TTTAATAATT CACGTTTTGT AGCGGTCACA TCATGATGAT TTGyTAAATC 12600 

GAGCTCTTGA AACGAATTAT TATGCTGTGT AATAAATGTC TGAATCTGCT TTTCAGTATG 12660 

ATGTAAAGAT GACTGACTTT CATCAACATG TTGATGAATC GTACGATGCT CAATCCAAAT 12720 

ATAGATGGCA TAGAAGCTTA CTAGTCCAAT AATAATGACT AAAAATACTG GAAAAATAGT 12780 

AGACnCAAAT AACGATCGTC TTAATTGATG TCTATAAGGT TTGTATGCCn TCATTGAATC 12840 

ATCTCCAAAA ATTTATGATG TGGAATATCC GGTAATTTAG ATTTCGGTAT TAAAGGTATG 12900

TTCTTAAGAT TTTCGATAGA CTGATCGCTT TGTTCACTAA CATCCTTTCG AATTGACTTG 12960 

GCAT-CGAACT CTGCAACTAA TCGTtGTTGT ACTGAGCGGC TTGTTAAATA TTGCACTAAC 13020 

TTTTTACGCT TAGGATGAGG GTGTGCATTT TTAACTAAAG CAATrCCATC AACATTTAAC 13080 

ATTGTTCCTT CAATTGGATA AACGATTGAT ACAGGATAAC CTTTGTTTTT CCATGTGCGT 13140 

GCATCTTGTT CGTAGCTTAG ACCTGCGTAA TATTTACCTT TTGCAACATC TTCAATGACT 13200 

TTAGACGTCT TTGACAGTTG CATCGCATGG TTTTGGAATT GATGCACATC ACTTACTCGA 13260 

TGATGCATGC TATAAATAGC ACGCATATGT TGATAGCCTG TCGTTGTTGT ATTTGGATTT 13320 

GAGTACGCAA TTTTACCTTT AAGTATAGGT TGTAATAAAT CTTGATAACC TCGAATCTTA 13380

ATATCTCCTT GTAAATCTGA ATTCACTACT ATAACTGTTG GCATTAATAG AAAACTAGTA 13440 

ACATATTTAT TGTTCGAGCG ATAATCCTCT AATTGCTGTG TTACAGATGT ATCTTGATAG 13500

CCACGCTCCG AAAAATCTTC GTTATGCAAG TTTGAAAGCA GTACTTGAGT AGATCCGTGT 13620

TTAATTTCAA TTTTGACATG CTCTTGTTTT TCAAATTCAT TTAAAATTGG ACGAATCAAG 13680

TTTGATTGAT ACGGAGAATA AACTGTTAAT ACATTTTTAT CGGATTCAGA GTGACGCGTA 13740

TTAGCGCATG CTGaTAAAAA AATGAGAAAT AATAGCAAGA TATAAATTTT TGATTTCATG 13800

ATATCCCATC AATTCTATGT ATATTTTAAT ACAATAATTT TAGCAATAAA TGACGCATAA 13860

GTAATGTTAA ATATTTAGAA ATGTTTATAG ATGACTTGTT AAGACGTTGC AAATGTTGTG 13920

ATAGCACAAA ATTTTTGTTT GTCAAGACGA TTTACCGAGG CTGTAAAATC AAACTGTTAT 13980

ATTTTATTTG TAGCTGTTAT ATAAAAATCG GCAAGATATT GAACGGTTCA AAAGTGAATT 14040

TTTACGTCAA TAAAAGTATT TAATCCAGTC TCTTCATATA TAAAAGTAAA TCTTTCTAAG 14100

TGTTGATTTA ACGCTTATCA ACAATCATTT TTTATAAACA AATATATACT CCTAAATTAA 14160

CTTTTAAAGC AATGAAAATA GTGAACATTA TAACTGTTGT GTAACAGAAT GCAATTAGCA 14220

TATTACTGTT ACACAAATTA GTACAGTTTC TATGTTTTGA CATACATTTG ATGAAAATTG 14280

TACATAATTT ATGTGAAAAA AATCACAACA AACATGCTAC AATGACTATG AAAACGTTAA 14340

CATAGCATTT CAAATTCACA ACATTATACA GATGGAGGCG TTTAGTATGT TAGAAACAAA 14400

TaAAAATCAT GCAACAGCTT GGCAAGGATT TAAAAATGGA AGATGGAACA GACACGTAGA 14460

TGTAAGAGAG TTTATCCAAT TAAACTACAC TCTTTATGAA GGTAATGATT CATTTTTAGC 14520

AGGACCAACA GAAGCAACTT CTAAACTTTG GGAACAAGTA ATGCAGTTAT CGAAAGAAGA 14580

ACGTGAACGT GGCGGCATGT GGGATATGGA CACGAAAGTA GCTTCAACAA TCACATCTCA 14640

TGATGCTGGT TATTTAGACA AAGATTTAGA AACAATTGTA GGTGTACAAA CTGAAAAGCC 14700

ATTCAAACGT TCAATGCAAC CATTCGGTGG TATTCGTATG GCGAAAgcAG CTTGTGAAGC 14760

TTAtGGTTAC GAATTAGACG AAGAAACTGA AAAAATCTTT ACAGATTATC GTAAAACACA 14820

TAACCAAGGT GTATTCGATG CATATTCTAG AGAAATGTTG AACTGCCGTA AAGCAGGTGT 14880

AATCACTGGT TTACCTGATG CATACGGACG TGGACGTATT ATCGGTGACT ATCGTCGTGT 14940

AGCTTTATAT GGTGTAGATT TCTTAATGGA AGAAAAAATG CACGACTTCA ACACGATGTC 15000

TACAGAAATG TCAGAAGATG TAATTCGTTT ACGTGaAGAA TTATCAGAAC AATATCGTGC 15060

ATTAAAAGAA TTAAAAGAAC TTGGACAAAA ATATGGTTTC GATTTAAGCC GTCCAGCAGA 15120

AAACTTCAAA GAAGCAGTTC AATGGTTATA CTTAGCATAC CTTGCTGCAA TTAAAGAACA 15180

AAACGGTGCA GCAATGAGTT TAGGTCGTAC ATCAACATTC TTAGATATCT ATGCTGAACG 15240

TGACCTTAAA GCAGGCGTTA TTACTGAAAG CGAAGTTCAA GAAATTATTG ACCACTTCAT 15300

AGACCCAACT TGGGTAACTG AATCTATCGG TGGTGTAGGT ATTGACGGAC GTCCACTTGT 15420

TACGAAAAAC TCATTCCGTT TCTTACACTC ATTAGATAAC TTAGGTCCAG CTCCAGAACC 15480

AAACTTAACA GTATTATGGT CAGTACGTTT ACCTGACAAC TTCAAAACAT ACTGTGCAAA 15540

AATGAGTATT AAAACAAGTT CTATCCAATA TGAAAATGAT GACATTATGC GTGAAAGCTA 15600

TGGCGATGAC TATGGTATCG CATGTTGTGT ATCAGCGATG ACAATTGGTA AACAAATGCA 15660

ATTCTTCGGT GCACGTGCGA ACTTAGCTAA AACATTACTT TACGCTATCA ATGGTGGTAA 15720

AGATGAAAAA TCTGGTGCAC AAGTTGGTCC AAACTTCGAA GGTATTAAGA GCGAAGTATT 15780

AGAATATGAC GAAgTATTCA AGAAATTTGA TCAAATGATG GATTGGCTAG CAGGTGTTTA 15840

CATTAACTCA TTAAATGTTA TTCACTACAT GCACGATAAA TACAGCTATG AACGTATTGA 15900

AATGGCATTA CATGATACAG AAATTGTACG TACAATGGCA ACAGGTATCG CTGGTTTATC 15960

AGTAGCAGCT GACTCATTAT CTGCAATTAA ATATGCACAA GTTAAACCAA TTCGTAACGA 16020

AGAAGGTCTT GTAGTAGACT TTGAAATCGA AGGCGACTTC CCTAAATACG GTAACAATGA 16080

CGACCGTGTA GATGATATTG CAGTTGATTT AGTAGAACGC TTCATGACTA AATTACGTAG 16140

TCATAAAACA TATCGTGATT CAGAACATAC AATGAGTGTA TTAACAATTA CTTCAAACGT 16200

TGTATACGGT AAGAAAACTG GTAACACACC AGACGGACGT AAAGCTGGCG AACCATTTGC 16260

TCCAGGTGCA AACCCAATGC ATGGCCGTGA CCAAAAAGGT GCATTATCTT CATTAAGTTC 16320

TGTAGCTAAG ATCCCTTACG ATTGCTGTAA AGATGGTATT TCAAATACAT TCAGTATCGT 16380

ACCAAAATCA TTAGGTAAAG AACCAGAAGA TCAAAACCGT AACTTAACTA GTATGTTAGA 16440

TGGTTACGCA ATGCAATGTG GTCACCACTT AAATATTAAC GTATTTAACC GTGAAACATT 16500

AATAGATGCA ATGGAACATC CAGAAGAATA TCCACAGTTA ACAATCCGTG TATCTGGTTA 16560

cgctSttaac TTCATTAAAT TAACACGTGA ACAACAATTA GATGTAATTT CTCGTACATT 16620

ccatgaaagt ATGTAACAAA ATTTAAGGTG GGAGCACTAT GCTTAAGGGA CACTTACATT 16680

ctgtcgaaag TTTAGGTACT GTCGATGGAC CGGGATTAAG ATATATATTA TTTACACAAG 16740

gatgcttact TAGATGCTTG TATTGCCACA ATCCAGATAC TTGGAAAATT AGTGAGCCAT 16800

caagagaagt CACAGTTGAT GAAATGGTGA ATGAAATATT ACCATACAAA CCATACTTTG 16860

atgcatcggg TGGCGGTGTA ACAGTCAGTG GTGGCGAACC ATTGTTACAA ATGCCATTCT 16920

tagaaaaatt ATTTGCAGAA TTAAAAGAAA ATGGTGTGCA CACTTGCTTA GACACATCGG 16980

ctggatgtgc TAATGATACA AAAGCATTTC AAAGGCATTT TGAAGAATTA CAAAAACATA 17040

cagacttgat ATTATTAGAT ATAAAACATA TTGATAATGA CAAACATATT AGATTGACAG 17100

TATGGATTCG ACATGTCCTT GTGCCTGGTT ATTCTGATGA TAAAGACGAT TTAATTAAAC 17220

TAGGGGAATT TATTAATTCT CTTGATAACG TCGAAAAGTT TGAAATTCTG CCATATCATC 17280

AGTTAGGTGT TCATAAGTGG AAAACATTGG GCATTGCATA TGAATTAGAA GATGTCGAAG 17340

CGCCCGATGA TGAAGCTGTT AAAGCAGCCT ACCGTTATGT TAACTTCAAA GGGAAAATTC 17400

CCGTTGAATT ATAAATACAA TTCAGACCGA AAAGAAAGCA TATGCAACTT CAAGAGTGAA 17460

GGGGCATATG CTTCTTTTTC AATTGAGTAT TGAGTATTAG CAAGACGTAG TAAGTATATG 17520

AGACAACTTC TACAATGGTT GAAGGAAGAC GTTTTTGTAA GTAGCTATGC TGATAAAGAA 17580

TGTGATGTCT TGTTAAAGGT GGGGTTCCAA TATCATCATT TAGCTGATGT TGAATGGGTT 17640

ATTATTTGCT ACTTGCATAT GAATATGAGT CTTTTCAAAT TTTTATTGAC CCTGAGTAAT 17700

GAAAAATATT AAGATGAAAC TTAATATTAA AgCAATGCGG AGCGTGATTA TGAAGAGAAT 17760

TAGTAAAGAT ATATGGGCAG TATTTAAATT ACTGTATCaA AATAAAGGGC GTTTTAGCAT 17820

TAATGCCTTA CTATTGCAGT TAATCATGAT TTTTATTAGT AGTACATACT TAATTTTACT 17880

ATTTAATATG ATGTTAAAAG TAGCTGGcAA AGCCAACTTA CGATTAACAA TTGGACGGAA 17940

ATCGTTAGTC ATCCCGCCAG TGTGATACTT CTTATTATAT TCATATTAAG TGTTGCCTTT 18000

CTGATTTATG TAGAGTTTTC ATTGTTAGTT TATATGGTTT ATGCCGGCTT TGATCGACAG 18060

ATTATTACAT TTAAATCCAT TTTTAAAAAT GCCTTTGTAA ATGTGCGTAA ACTCATAGGT 18120

GTACCAGTTA TTTTCTTTGT CATTTATTTA ATGTTAATGA TACCCATTGC CAACCTAGGA 18180

CTAAGTTCAG TATTAACAAA AAATATTTAC ATACCTAAAT TTTTAACGGA AGAACTTATG 18240

AAAACGACGA AAGGTATAAT CATTTACGGT ACCTTTATGA TTGCTGTATT TATATTAAAT 18300

TTTAAATTAA TATTTACTCT ACCGTTAACG ATTTTAAACC GCCAGTCGTT ATTTAAAAAT 18360

ATGAGACTAA GTTGGCAAAT TACGAAGCGA AATAAGTTTC GGCTTGTTAT AGAAATAGTT 18420

ATATTAGAAC TCATCATTGG TGCGATTTTA ACATTAATTA TTTCAGGAGC AACATATCTT 18480

GCTATTTGTG TAGATGAAGA AGGAGATAAG TTTTTAGTCT CATCAATTTT ATTTGTTGTA 18540

TTGAAAAGCG CATTGTTCTT CTATTATkTA TTtACGAAAT TATCATTAAT CAGTGTGTTA 18600

GTACTGCACT TAA 18613

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1214 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

AAAGTTTTAA AAGGGGTGAG ATACTTGGCG AATAATCCAT TCCAGCTTTG CGTTTAAAAG 
GAATTATACT TGCCATTGTC GGTGCTTGTT TATGGGGATT AGGTGGTACT GTTTCTGATT 
TCTTGTTCAA ATATAAGAAT ATTAATGTCG ATTGGTACGT CACTGCTCGA CTTGTAGTCA 
GTGGTGTTTT CTTACTTATT ATGTACAAAA TGATGCAACC CAAACGTTCA ATATTTAGCG 
TATTCCAAGA TCGACGTATG TTAGGCAAAT TACTTATCTT CAGTATACTG GGCATGTTAG 
TAGTACAATA TGCTTATATG GCATCTATTA ATACAGGTAA TGCTGCGATT GCAACATTAC 
TACAATACAT TGCGCCAGTT TATATTATTA TTTGGTTTGT CATAAGAGGC GTTGCAAAAC 
TAACATTATT TGATGTGCTT GCTATTATCA TGACACTATT AGGAACATTT TTATTATTAA 
CAAATGGTTC ATTTTCTAAT TTAGTCGTCA ATCCTGCAAG TTTATTCTGG GGTATTTTAG 
CTGGTGTAGC ACTCGCTTTT TACACAATTT ATCCTTCAGA CCTACTTAAC CGCTTCGGTT 
CGATTCTAAT TGTCGGGTGG GCAATGCTTA TTTCTGGTGT TGCGATGAAT TTACGCCATC 
CAATTTGGCA CATTGATATC ACTAAATGGG ACATATCAAT TATATTATTT TTAATCTTTG 
GTATTATCGG TGGTACCGCA CTCGCATTTT ATTTCTTTAT CGACAGTTTA CAATACATAT 
CAGCGAAAGA AACAACATTA TTCGGAACTG TTGAACCTGT CGTAGCCGTT ATCGCAAGCA 
GTCTATGGTT ACATGTGGCA TTCAAACCAT TTCAAATCGT AGGCATCATT CTTATTATGA 
TTTTAATTTT ATTACTATCA CTTAAAAGAC AACCTGAAAC ATTAGATGAA TAAGAAAACT 
CTGATAATCA CTTTAGCAAG TAACTATTAT TTAACAACGT AGTTACCTTA TAGGTGATAT 
CAGAGTTTTT TATTTTAGTT AATAATATTT TTCACTTGGT ATAAAAAaGC GTCGTCGCTC 
TGGTAATCGG AAATACTGGA ATAAAATATG GAATTGGGTA ATAATCCCAG GTAnTAAAAG 
TCCATGTTCC GATAnCCTnT CCGCAnCTCC AACCAAATTT GCCGATAAGG TTCCAAAAGG 
CATCCTGGGG GTAC 

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

ATTTTGGTTT CATTCACGAT GGGGTnATAC AGCAAACACA nCTAAAATAA CTATCAATAG

CTTAGACAAT AAAAAATATG CCACTACAAT CGCTAATATT ACGATTAAAA AAGAAGCGTT 180

AACGATTACT TTCATCGTTG TTCTATCTCT GAACATCATA TTAAAGACAA CTAGACTAAT 240 

TGATAATGAA ACAGCAAAAA AAGTAATAGC TAACACTAAT TTCATCATAA ATAGACAGAC 300 

TAAACCTATG ACTAATAATG TATTAGAAAT TACAGCTGAC GTTTTTAACA TTCTCGaATT 360 

AATATGCACT CACCCTTTTT ATTTAAATAA CTTACATAAT CATAATAATA CATGATGTTT 420 

CATAGGCCTG TCGATGATTG ATTCACAATA GCACGTGATT TTTTTGTTTT TCAATATTAT 480 

TCATTTATTC CATCAAAAAC ACCCTTTTTA ATTTTTACAA AAATTAAAAA AAGTGCTCCT 540

ACACTGCTTG CATGTAGAAA CACTTTTTCA TTGTAATGTT ATTCTTCTCG AGACATACCT 600 

TTTAGCATAT TAAGCATGTA TGTTAAACTA CGGTTCATGT CGTCATCTTT CAATACGCCC 660 

AATAGACTTC TTATAGTTGT CTTAGCATTT GGACTCGCTT GATTGGCAAC GTGTAATCCT 720 

TTATTAACTT TATTTAGGAA GTCGCTTAAA TCTGATACAT TGAGTTCACC TAATAAAAAT 780 

ACCATTGAAG CCATATTAGA TAATAGCCCT GTATAAATAT CTTTATTAAG TTCAACTGCA 840 

AATTTATTTA TGATGACTTG ACGTCCTCGA ATTGCACCAT TTAAAGCATC TAATAGTTTT 900 

GCATCATCTA ATGTTTTAAT AAGCTTGATT GCTTTTAATA TACTATCTTT ATTCGCTGCA 960

ATTGCCTCTG TAACTTCATT TAAACTTTCT AACTTAATTT GTTCTTCTGA TTTTTCTAAG 1020 

CGTCTAATTT TAGAAGATAT TCTCTCAGCC ATTATTTATC CACCTGATTT CCCGGGAAAA 1080

CATAATCTGA ACGTTCCCAT TTTTTCTGTA CTTGAACACT GTACTGCGGT TGACGTTTTT 1140

TATTGACACG GAAATTATTA GGGTTCAACG GTGACTTACC ACGTTTCGTA ATTACCTCCA 1200 

AACGACAGCT AGTACGTTTA TAAGATGGTG TATCCGTGTA TTGATCAACA TCACTaTTAG 1260 

TTAATAAGTT AATTGCACCT AGATCTCCAT TTTCCATCGC aTCaTTATTT AATGGAATAT 1320 

AGAT3TCTTT ACCTTTAACA CGATCTGTCA CGTGAACTTG TAATACCGCT TCTCCTGTyT 1380 

CAGAAATCAG CTTAACTTCT GCACCTTCAT GAATGCCTCT ATCTTCAGCA AGCTCTGGAG 1440 

AAATTTCAAC AAATGCACGT GGCACTTTGT ATTTAATCAT TGGTGTTTGA TAAGTCATAT 1500 

TACCTTCATG GAAGTGCTCT AACAATCGAC CATTGTTTAC ATGAATATCA TAAATTTCAT 1560 

CTTGCTTAAA GTAATTATCA AATGATAATG GGAATAATTT TGCTTTACCA TTATCAAAAT 1620

TGAATCCTTC TAAGTATAGA ATAGGCTCAT CAGTACCATC AGGTTGTACT GGCCATTGTA 1680 

AACTATTGAA TCCTTCTAAA CGATCATAAC TTACCCCAGC ATATAGAGGT GTTAAGCGTG 1740 

CTACTTCATC CATAATTTCA CTAGGATGCT TGTAATTCCA ATCAAATCCT AATCTATTAG 1800 

CAATTGCTTG GAAAATTTTC CAGTCAGGTT TTkAATCACC AAGAGGTTCT AATGCTTGGT 1860 
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TTGCTGGCAA TACAACATCT GCGTATGTTG CTGTGAATGT TAAAAATTCA TCTTGGACTA 1980 

CCATGAAATC TAATTTTTCA AACGCAGCTT GTACAAAATT AATATTTGAA TCCACAATAC 2040 

CCGTATCTTC ACCATATAAG TACAATGAGT GTACTTCTCC GTCATGTATA CCTTCTACCA 2100 

TTTCATGATT ATCTTTACCA GCTTTTGGAT TCAATTTAAC GCCATATTCT TTTTCAAATT 2160 

TAGCGCGAAT ATCATCCGCT TCAATACTTT GATAACCAGT AATCTTATCA GGCATACTTC 2220 

CCATATCACT ACATCCTTGA ACATTATTAT GTCCACGTAA TGGATACGCA CCAGTACCAG 2280 

GACGACGATA ATTACCTGTT ACTAATAATA AGTTTGAAAT CGCTGTACTT GAGTCACTAC 2340 

CAATGTCTTG TTGTGTAATA CCCATTGCCC AACAAATTAC AACAGATTCA GCTTTAGCAC 2400

ATTCTTCAGC AAATTTAATC AATTCTGATT CAGGAATACC TGTTGCTTCT TCAGCAAAAG 2460

CCATTGTAAA TGTTTCTAAT GATTTGTAAT ATTCATCAAA ATCATCTACC CACTCATCAA 2520

TAAATGCTTT ATCGTGTAAA TCATGATCAA TAATATACTT AGTCACTGCA CTTAACCACG 2580

CTAAATCCGT ACCTGGTTTA GGTTGATAAA AACGATCCGC ACGTTCTGCC ATTTCATGTT 2640

TTCTAATATC AAATACATGT ATTTTTTGAC CAAATAATTT TTGTGCACGT TTCATGCGTG 2700 

ATGCGATAAC TGGATGAGCT TCGGCTGTAT TAGTACCTAT CAATACAGAC ATTGCCGCTT 2760

TTTCTAAATC TTCAATACTA CCTGAGTCAC CGCCGTGTCC AACCGTTCTA AATAAGCCTT 2820 

TTGTTGCAGG TGCTTGGCAA TATCTTGAAC AGTTATCAAC GTTATTTGTG CCAATAACTT 2880 

GTCTTGCTAA TTTTTGCATT AAATACGATT CTTCATTCGT CGCTTTAGAA GAAGAAATGA 2940 

ATGATAGTGC ATCTGGGCCA TGCTTTTCTT TAATAGCTGT AAAATTATCT GCAATGACGT 3000 

TTAAAGCTTC ATCCCATTCT ACTTCATGGA ACTCACCATT TTTCCTTACT AGTGGTTTAG 3060 

TTAATCGTTG ATCTGAATTA ATATGTCCCC ATGAAAACTT ACCTTTAACA CAAGTCGCAA 3120 

TTTTATTTGC TGGAGAATCA TGTGATGGTT GTACTTTTAA AATTTCTCTA TCTTTAGTCC 3180 

AAACTTCAAA TGAACAACCC ACACCACAAT AAGTACACAC TGTTTTAGTT TTCTTAATAC 3240

GCTCTTTACG CATTTCTGCT TCTGAATCTG AGATTGCAAA TAGTGGACCA TAACCAGGTT 3300

CTGCTTTTTT AGTTAAATCA ATCATTGCTG CTAATGAACC AGGTTCCGTA TCAGTCATAT 3360

AACCCGCATT ACCTTCCATA TTCACTTCCA TCATGGCATT ACATGGACAT ACCGTCGCAC 3420

ATTGACCACA AGATACACAT GAAGACTCAT TAATCGGTAC ATCATTATCC CAAATAACAC 3480

GTGGATGTTC ACGATCCCAA TCAATTCTAA TAGTTTCATT CACTTCGATA TCTTGACATG 3540 

CTTCTACACA ACGCCCACAT AAGATACATT GATTTGGATC ATAACGATAA AATGGGCCGT 3600 

AATCTTTTTC GTATGGCTTC TCTTTATATT CATACGTTTG ATGCTGAAGC CCCCATGCAT 3660 
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TATGCTTTTC TAAAATTCGA TCAAGCGCTT CTTTTTGAGC ATCTTTCAGA TCATTGTTCA 3780

CAGTATTTAC AGTCATTGGA CGATCAATCA CCGTACTACA TGAACGTTCA ATTTTACCGT 3840

CAATCTCAAC AGTACATGTA TCACATGTTT GAATTGGTCC CATCGACTCG TTATAACAAA 3900

TTGAAGGTAC AAAAGTATCT TGTGATTTAA TAAATTCAAG TAAATTCGTA CCTGGTTCTA 3960

CAAGATAATC TTTTCCATCA AGTGTAACCA CCAAATGTTC TTGCATATTA CTCACCCCGT 4020

CTATATATAT TTTCCGTAAA TGACTTTTAA TAAATTGCTC ATATCCACCT AAAATAACGA 4080

TGCCCCACAC ATCTTTCAGA TAGAATTAAT TTAATTGTAT TACTTTATGT ACTAGTTGTT 4140

AAGTAAAATT TTGTATTTTG CCTTTTTACA ATCATTTTTA TTTGAAATAT TTTGCGCGAA 4200

ATTAAATCAT CTTTTTGTTT AATTGAAAAT AATTATCATT ATTAGTTTTC CAATTATCTG 4260

TTTCACGCTT TTTGCCATAT CTTTCACAAC CTTATTAATG ACAATATTTA ATAATCACCT 4320

CACCTAAAAA TCGTTATACT ATTTATAAAT ACCCTTTTTC TGAAAATTAA TAACCCAAGT 4380

TTGATAAATA TCTACTATCA TTTAGAAGGT AATATTTATC TTTAAATTAA ATTTGTAATG 4440

GATTAATTTA TAAAAATCAA ATCAGGCATT AAATAAAATA GCCCATAAAT ACAAAGTGTT 4500

ATCACCTTCT ATTTACGGGC TATTAGTTCT ATTCGTTATT CTATTTACAG ATCATTCTAT 4560

CTAATTAATT TGTGTACAAT TTTGATAACT TATTTTCCCT TAGTTTACTA CTCTAGATTA 4620

TCTTTTAATA ACTTAGTACT TTCAGCTTTT GACTGCTCAC TAGGAATGAA GTAGTACAAT 4680

CCGTCACTTT GAATGCCGCC TTGACCACTC AATTGATGTT TATTAATCGT GTCATTAGCA 4740

TCTTTATAAT TGCTTCTAAT CGTATTCAAA TCACCTAATG TTAAATCTGT TTTAACATTA 4800

TTTTGAATTT CATTCATTAG ACTATTAAAA TGTGTAATCG ATGATGGGCT TGCAATCTTA 4860

TTGGCCATCG CTTCAAGCAC AATTTGCTGA CGTTGTTGTC GACCAAAGTC ACCACCAGCA 4920

CCTTCTTCTT TACGACTTCT AATAAACTTC AATGCTTGAT CACCATTTAC ATGTGTCTGC 4980

TGTCCTTTTG TAAAACGAAC ACCATCAACA GTGAATGTAT CATTACTTAC TACATCAACA 5040

CCGCCGATGC TATCTATCAT ATTATGCAAA CCATCCATAT CGATTGTCGC ATAATGATCA 5100

ATTGGCACAT TCATTAATTT TTCAAGTGAT TTAACAGCCA TATTTGGTCC ACCATATGCA 5160

TAGGCATGTG CAATTTTTTC AGTAGTACCA CGGCCAACAA TTTCCGCTCT TGTATCACGC 5220

GGTATACTTA CTATTTCAGT TTTCTTCGTT TTAGGGTTGA TAGATAAAAT CATAATACTA 5280

tCACTACGCT CTCCGCCACC CTTTTTCTTA CGATCAGCAT CTGAATCGAC ACCAAATAAA 5340

GCGATTGTGA ATGGATCACC ATCGTTTAAA CTCACTTTTT TATCTCTTAA TTCTGAATGA 5400

TTGCGATCTA ACGGATTGTG TATCTTATTA CCAGTAATAA AAATTTTAGC AGCTACATAC 5460
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GGTAGGCTCA TTTTACTTTT AGACGAACGT TTCAATCCCA CCACTCCTTT ACTATTCCTT 5580

ACATACTTTG TCTGTTTTCT CTATTTATTA TATAGTAAAA TAATTTTTTT ACTATACTTC 5640

TGTAGACGTA TAACTATTTT TTATCATTTT TTATCTCTAG AGAATATCTA TCTGTATTTT 5700

TGATAACCAC CATTTGCATT TAAAATTTTA AGTACCGTTT CATGACATGC TTTATTACTT 5760

ATAATAAAAG GTGCACCCTT TAAATGATCA ATTGCCTTAC CATCTAAAGT CGTCATTTTT 5820

AGATTCAATA GTTCTGCAAA TAAAAACTGT GCAGCAATGT CCCAAGGTTT AGGATTTGTA 5880

TTAATATGTG CCCCAAATTG ACCTTTTGCC ACTCGCATAG AATCTAATCC GCAAGCACCA 5940

ACTAAACGAT AACTAAATGA GGCGTCAAAT AAATCTTGCA CCGTATCTAG ATTCATCACT 6000

TGTGCATTAA ACGATATAAT AGCGTCTTCC AATTTTAACG ATGGTGGTTC TTCCATCTTA 6060

ATTCCATTAC AAAAAGCACC TTCTCCTCGT ATTGCTTTAT AAAGCTTTTT ATGCGGATAA 6120

TCATATACGT ACGATAACAT TGGTTTACCT TCATAAAAAT ACGCCAATAT AATACAATAA 6180

TCTTCTTGCT GTTTTACTAA ATTGGCAGTT CCATCAATGG GATCCATAAT CCATAAATGA 6240

TTAATTTCAT TCGTAATCAT TTCATTACTT TTTTCTTCCG CTAATAGTTG GTGTTCCGGA 6300

AAATGTGTTG CTAAAAATTG TTGGAATTGT TGTTGAATCT GTTTATCTAC ATTTGTAACT 6360

AAATCAAATC GATGACGCTT AGTTTCTGTA GTCATTTCCA TAATTAATTG CGGAATAACA 6420

TTGTCTATTT GTTTCAACCA CGAACATATT AACTTATCTA TTTGCTGTAA TGTTTTATCT 6480

GTCATTTCGT CCACCACTTC TCATATCATT ATCATTTTAT TATTACCCTA TATTAAAAGA 6540

ATCAACAATA CAACTGAAGA CTTCTTCATT TTATGCATAA AAAAATCGGC TAGTCACGTG 6600

CTAGCCGACA AATAGAAAGG AAAGTAAGTA ATAAATATTG AAGATGTTGT GATGTAACTT 6660

GAACGATTAA AAGCTATCTG TTATATAGCT CTACCCCTTT GTTTAATCGC TCCCCCTGTT 6720

ACAAGTAATA TCATAGCACA ATCTTTTTTA AAATGTAAGC GTTTTCCACA AAATTTTTAC 6780

GATTTTTTTA AAAAGATATT GAAAATGTCC TCATTGTCAC TCTTATGTTA TACTTTGTGT 6840

AATATATCAT CTTTTAGGAG GTGGCTGTCA TGAATAAAGC TGAAAGGCAA AATTTAATAA 6900

TTACTGCAAT TCAACAAAAT AAAAAAATGA CCGCTTTAGA ATTAGCTAAA TATTGCAACG 6960

TATCCAAACG CACAATTTTA AGAGATATTG ATGATTTAGA AAATCAAGGT GTTAAAATTT 7020

ATGCGCATTA TGGGAAAAAT GGTGGTTACC AAATACAACA AGCACAATCT AAAATTGCAT 7080

TAAACTTATC TGAAACACAA TTATCAGCCT TATTTTTAGT GCTTAATGAA AGTCAGTCGT 7140

ACTCGACATT ACCATATAAA AGCGAAATCA ACGCAATTAT AAAACAATGT TTAAGTCTTC 7200

CACAAACACG CTTAAGAAAA TTGCTTAAAC GCATGGACTT TTATATTAAA TTTGATGACA 7260



ATGTGATGTT AGTAGATCAT AGGGTTGATG ATAATATTAA AGCTGAAAAC GTTATATTTA 7380 

TTGGCCTTTT GTGTAAACAT GGACATTGGC ATGCAGTCAT TTATGACATT GCTCAAGACA 7440

AAACTGCCGA ACTCGAAATT GAAAATATTA TAGATATTTC GTATTCATTC GGTAAGACGA 7500

TTCAAACCAG AGACATATCC ATTGATAACT ATCATCAATT TTTAAACCCC ATCGATTCCT 7560 

AAAAAACAGC AGTAAGATGA TTTTCAATTA GAAAATATCT TGCTGCTGTT CTCTATTTAT 7620 

ACAATACTTC GTATTGAATG GnTTCGCTTT CCTAGGGTGC CGTCTCAGCC TTGGTCTTCG 7680

ACTGGCACTG CTCCCTCAGG AGTCTCGCCA TTAATACTAC GTATTAACAT GTAATTTTAC 7740

TTTGAAATAC TTAAAAAAAT AAAACACTTT GCCCAACTTA CACTACCAAT AGAAACTGCT 7800 

GTTAGAATTC CTCAAAATGA TATTTCGCGA TATGTTAATG AAATTGTTAA AAAGATAGCT 7860 

GATAGCGAAT TCGATGAATT CAGACATCAT CGTGGCGCAA CATCCTATCA TCTAAAAATG 7920 

ATGTTAAAAA TCACCTCATA TTCATATACT CAATCTGAAT TTTCTGGCCG TAGAATAGAA 7980 

AAATTACTTC ATAACAGTAT TCGAATGATG TGGTTAGCTC AAGATCAAAC ACCTTCTTAT 8040

AAAACTATTA ATCTTTTTAG AGTGAATCCT AATACTGATG CGCTAATTGA ATCTTTATTT 8100 

ATTCAGTTTC ATAATAAAAT GCATATCAAA AAAGCTGATT TCTATCAAAT AATTAATAGA 8160

AATCAGCTTT TTTCaTTGCC TAAAAACTTA ATGTCCCGAC CTCTTTATCT ACGCATAAAT 8220

ACTTATTACT GATATAACGA AAGAAACAAA ATTATTTGCT ATATGTAATG CAATTGTTGA 8280 

ACCTAGGTTT CTTCCAGATT TTAAATAAGT GAAAACTAAT ATGATGGATA GTATGAGATA 8340

TGGACCAAAC TCAAACGGCG ACTTTGCATC AGTCACATGA ATAAATGCAA ATAAGAACAC 8400 

CGAAACAATA CTCATAGCTA TAAAATTAAA CTTCTTACCT AATTCTCCAA TTAAAATATG 8460 

TCTAAATACG ATTTCTTCAA CTATTGGACC TACAATCACA ATTAATAAGA ATGCTACAGG 8520 

TAAAAATGCA GGCACTTCAA ACATTTTATT TAGCTCAAGT TCATTGGCTG TTtCACTATA 8580 

TTGCAAATGT TTAGGTAGAA ACTGTGTCAT ATATTCATAT GTATAAATTA AGATGAGAGC 8640

AATAATATAC GTTATTGACA ATCTAAGCCA ATATTTTTTG ATATACGCAA AACCAGCTCG 8700 

AAGCCTTGAT GGCATCACTT TTAAATGAAA TAAATAAAAT GCGCCAATCC CAATCGTATA 8760 

TGCTAAAGCT TGTGTGATAG TCGCTACAAA TATCAGATTA CTATCGATTT CATAATAACC 8820

AAACAAAATT GGTCCTATGT AAGCTGCAAT TGTGAGTGCA TAAAATATAA CACCTATAAT 8880

TGGAATTATA AGCAAATCTC TCCATGCTAT ATCTTTAAAC GTGTATTTCT TTTTTTCATT 8940

TTCCaCTGTT ATATCCtTTC CTGTTTAATA ATTGATTTTT GGAGGTACTT CTACATGATA 9000

AACGAAACTA AGTATATGAG ACAACAAATT ACTAATTTGA TTCAAATCAT TGATACGATT 9060 
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ATAGTTACTA ATGAATTGAA TAAGTTCAAA GGCTTTGAAA CATCATATAT AATAAACGAA 9180 

AATCAAGTTT CCTATTATGA AATTATAACA CTACTTAATA AACGTCCCCT CgACAAGTCG 9240 

ACTATGGTAA CAAAATTCAA TATCTTAATT TTTATCATAC AGAACTATCT AACGCATTAT 9300

TTGCAATTAA ATTTGCCCAT TAACCTATTT TTCATAAAAT GTCATTTAAA CAAGTTATTT 9360 

ATTAAAATTC ACTTTATTAC ATAAATTATA CAATTArAAA GTTTCTTCAA ATTGTAAAGA 9420

TGCATTAATC GAGTTATAAT CATAATGATT AAGATGGT 9458 
(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 910 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 
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AnGCGTATCA TGTCACGCAT TTTAACTACT TCTTTACCAC AAGATTATAC AGTCACATTA 60

GTTGATCGTA TGCCATTTCA TGGATTGAAA CCAGAATTTT ATGCTTTAGC TGCGGGCACG 120

AAATCAGATA AAGATGTTCG TATGAAATTC CCTAATCATC CACAAGTGAA TACAGTTTAT 180

GGTGAAATTA ACGACATAGA TTTAGATGCT CAAATTGTCT CAGTCGGTAA TTCTAAAATT 240

GATTATGATG AGCTAATCAT TGGTTTAGGA TGTGAAGATA AATATCATAA CGTTCCAGGA 300

GCCGAAGAAT ATACACATAG TATTCAAACA CTCTCAAAGG CTCGGGATAC TTTCCATAGT 360

ATTAGTGAAC TACCAGAAGG TGCTAAAGTC GGTATCGTTG GTGCTGGATT AAGCGGCATA 420

GAACTTGCCA GCGAATTAAG AGAAAGTAGA TCAGACTTGG AAATATATCT TTATGACCGT 480

GGGCCGCGAA TTTTAAGAAA TTTTCCAGAA AAATTAAGTA AGTATGTTGC GAAATGGTTC 540

GCCAAAAATA ATGTTACCGT TGTTCCAAAT TCAAATATTA ATAAAGTTGA ACCTGGTAAA 600

ATATATAACT GTGATGAACC TAAAGATATT GATTTAGTTG TATGGACAGC AGGAATTCAA 660

CCTGTTGAAG TTGTTCGTAA CTTGCCGATT GATATAAATA GTAATGGACG CGTGATAGTT 720

AACCAGTATC ATCAAGTACC AACATATCGT AACGTCTATG TAGTTGGTGA TTGTGCTGAT 780

TTACCACATG CGCCAAGTGC TCAGTTAGCC GAAGTTCAAG GTGATCAAAT TGCCGATGTG 840

CTTAAAAAGC AATGGCTAAA TGAACCATTA CCTGACAAAA TGCCGGAACT AAAGGTACAA 900

GGTATCGTTG
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(2) INFORMATION FOR SEQ ID NO: 116: 
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(A) LENGTH: 10182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

TTTTTGATTC AAAGTGGTGA TTTAACAAGC ATTTTAAATA GCAATGATTT GAAAGTCACA 60

CATGATCCTA CCACTGATTA TTATAATTTA TCTGGTAAGT TGTCGAACGA TAATCCAAAC 120 

GTTAAACAAT TAAAACGTAG ATATAATATT CCTAAAAACG CATCAACAAA GGTGGAATTA 180 

AAGGGAATGA GTGATTTAAA AGGCAATAAT CATCAAGATC AGAAACTTTA TTTTTATTTT 240 

TCAAGTCCTG GAAAAGACCA AATCATTTAT AAAGAAAGCC TTACTTATAA TAAAATAAGT 300 

GAACATTAAT ACTTATGCTG TAATTATAGA AACATCCAAA TCATCTATTA nAATCCTATA 360 

TTATAAAAnC ACCTCACATA ACTCGTTCAA CTGTACCAAA CCACATTACA TTAGATTTTA 420 

GGCTAACTAT TGTGATGTAC ATCAAAAACG AATTTGTGAG GCGTTGTATA TTTTACAAAG 480 

GTGACTAGCG TTTCGTATAG CATTTCCAAC ATTACTACAC TCAAGCGTCA CGCTAAAGTT 540 

CGAAATCGAA TCCTTTCATT CAACAAAAGC TCATATCCAC TACAAACTTC ATATCAAGCG 600 

TATAAACTAT CTTGTGATAC TATCTCGATC ATATCTATAG TATGCATTTG TGTTCCGTTT 660 

CACTGAAGTA TATGTATCAT CAGTTAAGTA TAAACCGTCA TCCTTCAATG TTACTTGATA 720

AGCATATTTC CGTGCTAACC AGGCAATATC TATATAATTT TCTCCTGCGT TTTCATAACT 780 

TCTTAAATCT TCAATATGTG CACTAACTTC AGGGaAAATG ATTCTAACAA CACTTTCATC 840 

AACCCAATAT TTGTCATGCA TCCATCGCAC TTGATCTGCC AATAAAGGTA ACTGCACATC 900 

ATTGAAATAT AGACGAAAGC CGTCACTATC ATACATTTGC CGATATGGTA ATGGCTGTTT 960 

TCTAATCACT AACACCTCGC CACCCATTAC GGTGCCTTCT CTAGTATCAT CACTTCCACC 1020 

CGAAGCTTCA TACGTTGTTG GGTCAACCTG TAGTCCATGT ACATCTCCAA TATAAGCATC 1080 

TGGTTTATGT TCCATTGCAT GTCCATGTGC AATCAATGCT AATATTGTAG ATTGTGAAAA 1140 

TTGAGGCTCC CATTCAATGC GATTAGGATG GCTACTATAA ATTCTAGGTT CATCTATAGC 1200

CTGCTGAATA TCCATGCCAA ACACTAATAC ATTGATTAAT GTTTGCGCAA CACTAGCAAT 1260 

GATACTTATG GCACCAGGTG CACCTACTGT TAATATTGGC TTCCCGTGAT ACATCACAAT 1320 

CGTTGGAGCC ATGTTACTTA GTGGTCGTTT ATATGGTGCA ATTTCGTTAA TACCACCATC 1380

TACTACATCA AAGCCATCCA TTGTCGTATT CAATAACACA CCGTAGCCTG GAATCGTGAT 1440 

ACCTGAACCA TAAATCATAC CAATTGATGT CGTAAATGAA GCAATATTAC CTTCCTTATC 1500 
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ATCAGACACA ACACCATGCT CTATATCAAT ATTTGCTTTA TTGCTATCAA TGAGCGTACT 1620 

GCGTGCTTTT AAATAATCAT CATCAATTAA TGACTGTACA GGCACCTCAT GAAAATTATC 1680 

ATCCGCCAAG TATTGCGCAC GATCACTATA TGCTAAATGC ATCGCTTGTA TCAAATGATG 1740 

CAAGTAATCA ACAGATCTTG GACCCATAGA TGGTAAATCG ACATGTTCTA ATAACTTCAA 1800 

TATTTGAATT ACCGTGATAC CGCCAGAACT AGATGGTCCC ATTGaATAAA TGTCATAGTC 1860 

TTTAAATGTT GCACTGATTG GCGCTTTAAT CTGAATGTCA TATTTGGCTA GATCCTCTAA 1920 

AGTGATTGTC CCACCACATG CTTTGACAAC ATTGACTAAT TGTTTCGCAA TGTCACCTTT 1980 

ATAAAATGCA TTAAACCCTT GTTCTCTTAA TATTTGAAAT GTCTTACCTA ATTCGGGTTG 2040 

TACAATCCAA TCACCTTCAC GCCAATATTG ATTTTCATGC GTAAATACTT GTGCCGTTTC 2100 

ATGATACTTT GTCAATCGTG CGTGTTGCTG GCGCGAATAT TTTTCAGTAG CCCAATTGGC 2160 

TGCATGACCT TCAATGGCTA GTTCAATTGC AGGATTAATT AAATCTTCCA ATGACAATTT 2220

AGCATAACGC TTGTGAATAT AATCAAACAG CTTTGGAATT GCTGGCACAG CGACAGTTTT 2280 

ACCATGTGTA GTCATATCAA AAAATGATTT ATATTCGCCT GAATCATCTA GATAAAATTG 2340

TTTGTCTACA TGTTCAGGTG CTGTCTCACG TGCATCAAAC GCAGTTATAC TGCCAGTACT 2400 

TTGCTCATAA TATAGCAAAT ACCCGCCACC ACCAATACCT GATGCAAATG GTTCTACCAC 2460

ATTCAATGCC AGTTGAATTG CAATCACTGC ATCCATGGCG TTGCCACCTT GATCTAATAC 2520 

ATCCTTACCA ATTTTAGCCG CAAGAGGATG TGATACGGAA ATTAACCCTT CTTTAGATGT 2580 

TTTTGTCTGT TTGTCATTTA AGTTAATGAC CATACTATAT CCTCCTACTT TCTGTTAAAT 2640

ATTTAAAACA TTATTGATTA ATGGCTTTTT CTACTTTTTC TAAATCTTGA CGTTGCTCGT 2700

TACCAGTATC GACAAGTGGT GTAATCGGTG ATGCAATTTT AAATTTATCG CCACGATAAA 2760 

ACTfAATAAA TTGATCCTGA TCTATCGCAT TAACTACTGC TTGTCTCAAG TTTGGATGCG 2820 

TCTTAAATAT ACCTTTTTTA ATATTTAGCA TTAAAAAGAC TGACTTGCGT CCATTTTTGC 2880

GAATAATGCT TAAATTTTTA TCCGACTTAA TTAAATCAAA ATGTTTTTGA TTCACATCTG 2940 

CCAACATATC AATTGAATGA TTTCTAAGTT CTGACAATGC ATTATTCGGG TCACCATTAA 3000 

ACTTCAATGT AATATTTTTA ATTTTAGCTG GTCCATAACT ACCTTTTTCT GTTTCGTTGA 3060 

ATCCTGGATT ACGTTGAAAC GTTGCTTGAT ATGCATTTTT CTGTGTCATA ATGTATGCGC 3120 

CACTTGCATA CAGCGCATTT TTCCCATCTG AATTTGCAGG AATTGTACTG CTATCCCCAT 3180 

ATCCTTrTGG ATATTCTTGA TTTACTTGAT TAACAAATTT TTTAGATAAA ATGCCTGCCG 3240

AAGAGTGTGT TAAGTAATTT ACCTCTCGAG GCATCGATTG ATCTGTCGTA ATTTTAACAA 3300 
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TATAAGCTTT AATCAACTTA TCATAGATTG ATTTATCGTC CTTGTCTTTC TCTTTACGCA 3420 

ACTGATCGAT GTCCTCATCT TTTAATATCT TGATGTCATT TATATGTTTG TGCATATTGT 3480 

AAGTATTATT GTTAGGCACA GACTTTTTAT CACGTGCTCT ATCTAAAGAA AACTTAACAT 3540 

CTTCAGCCGA TACACGCTCT CCAGTATTAC GTGCTTGTCC ATTGACCACT TTCGCAAAAT 3600 

AATCATCATC TCTTAACAAG AAATAAAATG CTTTATTGTC CTTATTCACA GCATAATCAT 3660 

GACTTAACGA ACCTTTCGTT GTTAAATGAT CATTTTCATC TAATAATAAT AACCTTGTGT 3720 

ACATATTCAT ATTAATTGAA TATACTGACG GCGCAATTGA ACGTATTGGA TCCAATGTAG 3780 

GAATTTCACC ATCTTGTTGT GTCATCACAA GTGGCCGCGT ATCTCGTTCT CTACTATTGT 3840 

TGTAATCAAA TTGTTGCCAT ATTAATGCAC GTGAATTTGG CAATCCAACA CTATTTTTAT 3900 

CTAACACTTT ATTGTCATAT ACTAAATTCT TTTTTGATCC ATATAAAGGC GCCATATACC 3960 

CTTTATCAAA TACAACTTCA TCTTCAATTT GCTTATATGT TTGTTTAACA TCTGCTTCAT 4020

TTTGAGTAGA AGCTTTATTT AACAACTGGT CTACATGTTT ATCTTTCAAT AAACTATTTG 4080 

ATCCTGTAGA ACTAAATAAT GCCGTCATAG CATAGTTCGG GTCACCAAAC ACTGTCATCC 4140 

AGTCATCAAT TTGGATATCA TAATTGCCGG CTTGACGTTG TGTACGATAG CTACCATAAT 4200 

CTGGTTGGAT ATTCATCTTC ACGTTAAATC CTGCATTTTC CAATTGATCT TTAACGATAT 4260 

TCATATCATT TTCATAACTT GCTTGTCCTA GGAAATGTAT TGTTGGTCGC TCGCCTTTCA 4320 

CTTCAACTTT CGATGACTTT TGAGCCACTT CTGATTTCGT AGGGACACCA CAACCACTTA 4380 

ATACCAACGC TAAAACTATA ATTGCGATAC TAATGATTTT CTTCACATCT ATCCCTACCT 4440 

TTTTAATGAA TTCTTGGATC TAGTGCATCA CGCACTGCAT CACCTATAAA ATTAAATGCT 4500 

AAAACGACGA ACATAATACA AACACCAGGT ACAATAGCTA AATTACTGTG CGTTTCCAAG 4560

TAGTTACTAC CGGTACGTAA AATGTTGCCC CATTCAGCTA CATCAGGTGC AACACCAAGT 4620

CCTAGGAAAC TTAAACTACT TGTTGTTAAT ACAACCACAC CTATATTTAA TGAAAAACGT 4680

ACAATCATAG GCGCAATCGC ATTCGGTAAA ATATAACGCC ATATGATATT CCAAGTGTTT 4740 

TCACCAGTGA TACGTGCTGC ATCTACATAT TCCATGCGTT TAATTTCTAA AACACTGGCA 4800 

CGCATTGTCC GTGCAAATGA TGGTATATTA CCGATACTTA AAGCAATAAT TAAATTTGGA 4860 

ATACTTGCTC CAAATGATGC AATAATTGCC ACCGCTAACA ATAATGATGG AATTGCAAAC 4920

ACTACATCTA AAATTCGCAT TATTAAATTA TCAATATGAT TAAAATAACC TGCGATAGTG 4980

CCTAGTAACA CACCAAAAAT AACTGCAATA ACTACTGAAA TAATTGAAAT TGAAAATGTC 5040

AGCTTCGTTC CTACAACTAC GCGTGTAAAT AAGTCTCTAC CGAAATCATC AGTACCAAAC 5100 
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GTATCAAATG TAAATTGTGA CACAATTGAT AATGTCAGCA TGTAGACTAA AATAAGTAAC 5220

CCGATAATCG CAATACGATG TCTAGTAGTT TTTCGTATAA ACGATTCCCA CCCGTTATAA 5280

CTATGTATTT GCGATGTACG TTGGTAACGT CTAATACTTA CAAACATTAA TAATGTAAAT 5340

ACGTTGCCTG TTAATGTCAT CAACAATAAC AACACTTCGA CGATACGTCG CCATAGGTCA 5400

TGATGCTTCC ATGTTTGTTC CGTTGTTAAA ATAATAATTA AAATGATGGT TAAAACGATT 5460

AGCAATGTTT CAGCAATATA GAACGTATCG GCCACATAAC CTTTAAAAAG ATTTAATGCA 5520

CTCGTTAATA TAACTAAAAT ATAAGTTGCT ATGGCGTAAC TTGCGAATAA TTTTAAGGAA 5580

GCTATCTTTG AATTAAGTTG CTCACTTCCT TTCGTTGATT TCACTACGTA 5640

ATTTTGGATC GATTAAAGCA TAAAATATAT CAATAATTAA GTTTGCTAAA GATATTACAA 5700

TTGATATATA TACGACCCCA CCCATGACTG CTGGAATATC AGGTATTAGT TGTTTTTGGA 5760

CGATATAACG CCCGATACCA TTAATGTTAA ATACTTGTTC CGTCACTGCT GAACCGCCTA 5820

GTAACTCTGC CACTAGAAGA CCAACTAACG TTACAATTGG AATAATGGCA TTTTTCAAAA 5880

TATGTTTAAT AACAACTTGT GTCGTCGATA ATCCTTTTGC ATAAGCAGTT AAAACATAAT 5940

CGctGCGCAT TACTTCAAGT ACAGAAGACC TTGTCATACG CGTGATAGAA GCAGCAATAC 6000

TTGTTCCAAT GACAAGTACA GGTAAAATCA ACGATATTGG ATGTTCTGGC ATATAAGATG 6060

GTGGCAAAAT ATCCAATTTC AATGAGAACG CTAAAATGAA TAATAGCCCT TGCCAGAAAC 6120

TTGGAATAGA TAAACCAATT AATGCAATTA TCATTAACGT GATATCAAGC CAACTATTTC 6180

GCTTCATCGC ACTGATAATA CCAATTGGTA TTGCAATAAT TAATGCCACC ATTAGCGCTA 6240

ATACTGCGAC AATTATTGTA ATTGGAATTC TTTCGCCAAC TGCTTTAGTC ACAACCTCAT 6300

TCCCTTTGTA AGTCGTACCT AAGTCAAAGG TAAAAACACC CTTGATGGTA TCCCACAATT 6360

GAATSAAATA AGGTTCGTTA AGATGATGTA ATACATTGAA TTGATGTATC TGTGCCTTTG 6420

TTGCATTTTG TCCCAGTATG CTATAAGCCG CATCAAGCGG TGAAAAATAC AGAATGGTAA 6480

ACACACTGAC AATAACACCA ATGATGACAA TCACAGCCAT GACAATTCGT TCAAAAATAT 6540

ATCTAACTAA TGGCTGTAAA TAAAAAGTCA ATAAGATGAA CATCGGCAAG GCCAATATCA 6600

CTTTGATCAT GATGAACTTA TGAAATAATA CATTTTCAAA GTATGTTGAA AAATGTGCTT 6660

GTTCAATATT CTTTGAACTC GTATTAGAAC TTTGTGCCTT GAATATTTTT AATGCTTCTT 6720

TATGTATTTG TGTGGATGAC TTTTGCTGCG ATAAATATTT ATATTTTTGA TGTAACGCCT 6780

GTTCAATTTC TGAAATTTCA GAATTATTAG CGTAAAAATT TTTCCTCTTA GCAGAAAAGA 6840

AAAACTTTAT CACTGCATAT AAAAATATTG GCAAGCTTAA TACCGATAAT ACAAACTTGT 6900
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CTTGTAAAAT AATCTTGAGT AGATTACTAT GATATACAAA AGTATAGAAT AAATTTACAC 7020

ATTTGTGaAT AGGGAGGCAC AACATCATGT CAAATTTATT AGAAGTCAAC AGTCTGAATG 7080

TACAATTCAA TTATGATGAA ACTACAGTTC AAGCGGTAAA AAACGTCTCT TTCGAATTAC 7140

GAAAAAAACA TATCCTAGGT ATTGTTGGTG AATCAGGATC AGGAAAAAGT ATTACCGCTA 7200

AATCTATTTT AGGGCTACTA CCAGATTATC CAGATCACAC ATTAACAGGA GAAATTATTT 7260

TTAATGGGCA ATCGTTAAAT AATTTATCAA CTTCAGCGTT ACAACAAATT CGAGGTAAGG 7320

ATATTTCAAT GATTTTTCAA GATCCACTCT CTTCGTTGAA TCCAAGATTA ACGATTGGCA 7380

AACAAATTAC AGAAGTAATA TTTCAACRTA AACGTGTATC TAAATCTGAA GCAAAGTCGA 7440

TGACAATAGA CATTTTAGAA AAAGTAGGTA TAAAACATGC AACTCGACAA TTTGATGCTT 7500

ATCCACATGA ACTTTCTGGT GGTATGCGTC AACGTGTCAT GATAGCAATG GCATTGATTT 7560

TAAAGCCACA AATTTTAATC GCAGATGAaC CAACAACGGC ATTAGATGCC AGTACACAAA 7620

ATCAATTACT GCAGTTAATG AAGTCCCTTT ATGAGTACAC AGAAACATCT ATTATTTTTA 7680

TCACTCACGA TTTAGGCGCT GTGTATCAAT TTTGCGACGA TGTGATTGTA ATGAAAGATG 7740

GAAGTGTCGT TGAAAGTGGC ACGGTTGAAA GTATTTTTAA ATCGCCACAA CATACCTATA 7800

CAAAACGCTT AATAGATGCG ATTCCTGATA TTCATCAAAC GCGTCCGCCA AGACCGTTAA 7860

ACAATGATAT TTTATTAAAA TTCGATCGCG TGAGyGgGAT TACACATCAC CGAGTGGCAG 7920

CCTATACCGA GCAGTTAATG ATATTAACTT GGCTATTAGA AAAGGCGAAA CATTAGGCAT 7980

TGTCGGTGAA TCAGGGTCAG GGAAATCGAC ATTAGCTAAG ACGGTCGTCG GTCTAAAGGA 8040

AGTGTCAGAA GGCTTTATTT GGTATAACGA ATTACCATTA AGTTTATTTA AAGATGATGA 8100

ATTGAAATCT TTACGACAAG AGATACAAAT GATTTTTCAA GATCCATTCG CATCTATTAA 8160

TCCAAGATTT AAAGTCATTG ATGTGATTAA ACGACCACTA ATCATTCATG GGAAAGTCAA 8220

AGATAATGAT GACATTATTA AAACTGTCGT ATCGTTGTTA GAAAAGGTTG GCCTAGATCA 8280

AACTTTCTTA TATCGCTATC CACACGAATT ATCTGGTGGG CAACGTCAGC GTGTAAGTAT 8340

CGCGAGAGCA CTTGCTGTTG AACCTAAAGT GATTGTTTGC GACGAGGCAG TGTCCGCTTT 8400

AGACGTTTCA ATTCAAAAAG ATATCATCGA GTTATTAAAA CAATTACAGT TAGACTTCGG 8460

CATCACTTAT TTATTCATCA CACATGACAT GGGTGTTATC AATGAAATAT GTGATCGCGT 8520

TGCAGTTATG AAAAATGGCG AAATCGTTGA ACTGAATAAC ACAGAAGATA TTATCAAACA 8580

TCCGCAGTCA GACTATGCAA AGCAACTTAT TTCAGAAGTA GCAGTTATTG CTAAATAAAA 8640

GTCATGCGTT GTGCAACTTT ATCACTGTAT GGTCTGAAAT AAATTGCGCG ACTTCTGATG 8700
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TATCAAGTTT TAGGTGCTTT GCCATGATTT AAGAGTCACC CCCATACTTT GGGCATTTTA 8820

ACGCCAGAAT AAATCCCCCG CCACTATGTG AAGTGTGGGG GATTATTTAT ATTTTATTAG 8880 

AATATTCAGA TTTTTGAGTG TGTCAACTTA GCTTAGTCAA TGTATATTTA ACGTCACTTA 8940

CTCTTTTTCT TTCATAATTA ACACATTCAA ATAAACTTTG ATCAAAAAAC ACAAAGTTAA 9000 

AAGTACCATC TTGTAATATG CTCTCATACA TTATCCCGTC ATATTTAAGG CTTCGAATAT 9060 

AATCAGCTAA ATATTGAAAT GGCAAATAAT CTATTCCTTG TTCATCGCTT GGATTTGTTA 9120 

TTCCTTTATG AATCTTTTTT AATGTTTGGT AATTTACAAA ATACTTTCTA AATCCATCAT 9180 

CGCCAGCTTT GATTGCATTA CTAGTTAAAT TAGTTAAATT CGCAATTTTC AATTTCTCTT 9240 

TTGTCACGTT TTTTTGTAAC TTAACCTTAC CTATATAAAT AATGTCATTA TGCTTAGGTT 9300 

TAACTTCTTC TATACTGACC TGTTCTTTTG TACTAAGGTA TAATACGCTT ATCCATTTAG 9360 

AATTCAATCT TCCTGCCGTT GCAAATCCCT TTGGTGGTGA CATTAGTTCA CTTTTCTCTG 9420 

TAATGAACTT AACTATTCTA GATCTATATA ATGGTTCAAA TCTTTCTCTA AATTCCTCAA 9480

TACTATAGTA ATTAGTAGTG ATATCGAGAA AGAACGCTAA ATTCTCTAAA TTGATCATAT 9540 

TTTTATGAAA TCTATTTTTA TACTTCAAGC TCTCACAAAA TCCATCCCAG TCATTATTTG 9600 

CTACAATTAG ATTTTTATTT GTATATTTTT TATCGTTTAT GATTTTAGCG CCTACTAAAT 9660 

CTTCCAACAC TCGTCTATCT AAATTTTCAT CATCTTTAAA AAGTTCATTT AAAATACAAC 9720 

TTATTTGAGC TTCCTCAACA TTAAATATAC TCCAGTCGTC TTTTAATGCT ATTTCAATCT 9780 

TTTTACCTTC TTTTGGGCTA AAAGTATCTG GTAAATTTAT ACTAATATCA TATAATTCTA 9840 

ATGCTGGTCT TAAATAATCT CTAATAAGTT CTAATTTATC TATGTCCTTA GTCGTATCAA 9900 

ATATTTTAAC ACCAAGATGA TTGTTATCAA TATCACAATT GTCAAATTTG CTATTTATCA 9960 

TTTGCAATGA TTTCTACGAT TTCAGTATTA TTAAAACATT TTTCACATAT TTTCATTTTG 10020 

AGACfCCAAG TATCTATTCA TAATTTCTAG GTGATGCATG ATAGATAACC TTTTAATTAA 10080

ACCTAATCCT GGATaCTTAT TATTTTCATT TAATTCTTCA AATTGTCCCA AGCGCATAAG 10140

ATCTATTTTT AATATCTAAG TTTTTTGACC ATGTTACTAA TT 10182 
(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3491 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
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AACTCAGGCA ATTGAAACAG CATTAGGTGC TTCATTACAA CATGTCATTG TAGATTCAGA 60

AAAAGATGGA CGCCAGGCTA TTCAATTTTT AAAAGAACGT AATTTAGGTC GTGCGACGTT 120

TTTACCATTA AATGTTATAC AGAGTAGAGT GGTAGCGACT GATATTAAAT CTATTGCTAA 180

AGAGGCAAAC GGATTTATTA GTATCGCTTC GGAAGCAGTT AAAGTAGCAC CAGAATATCA 240

AAATATTATC GGGAATTTAT TAGGTAATAC GATTATCGTT GATCATTTAA AGCATGCAAA 300

TGAATTGGCA CGTGCGATTA AATATCGAAC TCGTATTGTT ACTTTGGAAG GTGATATTGT 360

AAATCCTGGT GGtTCTATGA CTGGTGGTGG CGCTCGTAAG TCAAAAAGTA TTCTGTCTCA 420

AAAAGACGAG TTGACAACAA TGAGACACCA ATTAGAAGAT TACTTGCGTC AAACAGAATC 480

ATTTGAACAA CAATTTAAAG AGTTGAAGAT AAAAAGTGAT CAATTAAGTG AACTGTATTT 540

TGAAAAAAGT CAAAAGCATA ATACACTTAA AGAGCAAGTG CATCATTTTG AAATGGAGCT 600

CGATAGATTA ACTACACAAG AAACACAAAT AAAAAATGAT CATGaAGAAT TCGAATTTGA 660

AAAAAATGAT GGTTATACGA GTGACAAAAG TCGACAAACT TTGAGTGAAA AAGAAACTTA 720

TCTAGAAAGT ATTAAAGCAT CTTTAAAACG ACTAGAAGAT GAAATTGAAC GCTACACAAA 780

ACTTTCTAAA GAAGGTAAGG AAAGCGTTAC TAAAACACAA CAAACCTTAC ATCAGAAACA 840

ATCTGATCTT GCTGTGGTTA AAGAGCGTAT TAAAACACAA CAACAGACAA TAGATCGATT 900

AAATAATCAA AATCAACAAA CTAAACATCA ATTAAAAGAT GTTAAAGAAA AAATTGCATT 960

CTTTAATTCG GATGAAGTGA TGGGCGAACA AGCTTTTCAA AATATTAAAG ATCAAATTAA 1020

TGGTCAACAA GAAACGAGAA CACGCTTATC AGATGAATTA GATAAATTGA AACAACAACG 1080

TATTGAGTTG AATGAACAAA TCGATGCGCA AGAAGCTAAA CTACAAGTTT GTCACCAAGA 1140

TATTTTAGCT ATCGAAAATC ACTACCAAGA TATTAAAGCT GAACAATCAA AGCTAGATGT 1200

ATTAATTCAT CATGCGATAG ATCATTaAAT GATGrATATC AATTGACTGT TGAACGTGCG 1260

ArATCTGAAT ATACGaGTGA TGrATCGATg ACGCATTACG TAAAAAAGTT AAGTTAATGr 1320

AGaTGyCGAT TGATGrACTA GGTCCTGTAA ACTTAAATGC AATTGAACAA TTTGAAGAGT 1380

TAAATGAACG TTATACATTT TTAAGTGAAC AACGTACAGA TCTTCGTAAA GCTAAAGAAA 1440

CATTAGAGCA AATTATAAGT GAAATGGATC AAGAGGTTAC TGAAAGATTT AAAGAAACTT 1500

TCCATGCTAT TCAAGGACAT TTTACAGCTG TGTTCAAACA ATTGTTTGGT GGAGGCGATG 1560

CAGAATTGCA ATTAACTGAA GCCGATTATT TAACAGCTGG TATTGATATT GTGGtACAAC 1620

CACCGGGTAA AAAGTTGCAA CATTTATCGT TACTGAGTGG TGGTGAGCGT GCATTAACTG 1680

CTATTGCTTT ACTATTTGCA ATTTTAAAAG TAAGATCTGC ACCTTTTGTT ATATTAGrTG 1740
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TATCAGACGA AACACAATTC ATTGTTATTA CACACCGTAA AGGAACAATG GAATTTGCAG 1860

ATAGGTTATA CGGTGTAACA ATGCAAGAAT CAGGTGTTAC TAAACTTGTG AGTGTGAATT 1920 

TAAATACAAT AGATGATGTG TTGAAGGAGG AGCAATAATG AGCTTTTTTA AACGCTTAAA 1980 

AGATAAGTTT GCAACAAATA AAGAAAATGA AGAAGTTAAA TCCTTAACAG AAGAACAAGG 2040 

TCAAGACAAA TTAGAAGATA CACATTCTGA AGGTTCAACG CAGGACGCAA ATGATTTAGC 2100 

AGAAAATGCT GAAGTGAAAA AGAAGCCACG CAAGTTGAGT GAAGCGGATT TTGATGACGA 2160

TGGCTTAATA TCAATTGAAG ATTTTGAAGA AATTGAAGCT CAAAAAATGG GTGCTAAATT 2220

TAAAGCAGGA CTCGAAAAAT CTCGTCAAAA TTTCCAAGAA CAATTAAATA ATTTGATAGC 2280

GAGATATCGT AAAGTAGATG AAGACTTTTT TGAAGCTTTA GAAGAAATGT TAATCACTGC 2340

AGACGTCGGT TTTAATACAG TGATGACGTT AACTGAAGAA TTACGTATGG AAGCACAACG 2400

ACGTAATATT CAAGATACTG AAGATTTGCG TGAAGTCATT GTTGAAAAGA TCGTAGAGAT 2460

TTACCATCAA GAAGATkATA ATTCAGAAGC TATGAACTTA GAAGATGGTC GTTTAAATGT 2520 

CATTTTAATG GTTGGTGTGA ATGGTGTTGG TAAAACAACA ACAATTGGAA AATTAGCTTA 2580 

CCGATATAAA ATGGAAGGTA AAAAAGTAAT GTTAGCTGCG GGCGATACTT TTAGAGCGGG 2640 

TGCTATTGAT CAATTGAAAG TTTGGGGCGA ACGTGTTGGT GTAGACGTAA TTAGCCAAAG 2700 

TGAAGGTTCT GATCCAGCTG CTGTTATGTA TGATGCgATT AATGCCGCTA AAAACAAAGG 2760 

TGTTGATATT TTAATCTGTG ATACCGCTGG ACGTTTACAA AATAAmACAA ATCTAATGCm 2820 

AGAATTAGAA AAAGTTAAGC GTGTAATTAA TCGAGCAGTG CCAGATGCGC CTCATGAAGC 2880 

ATTACTATGT TTAGATGCTA CAACTGGTCA GAATGCGTTG TCACAAGCTA GAAACTTTAA 2940 

AGAAGTAACA AATGTTACAG GTATTGTATT AACGAAATTA GATGGTACAG CCAAAGGTGG 3000 

TATCGTATTA GCCATTCGTA ATGAATTGCA CATCCCAGTT AAATATGTAG GTTTAGGTGA 3060 

GCAXTTAGAT GACTTACAAC CATTTAACCC TGAAAGTTAT GTCTACGGCT TATTCGCTGA 3120

TATGATTGAA CAAAATGAAG AAATAACAAC AGTTGAAAAT GATCAAATTG TAACAGAAGA 3180

AAAGGACGAT AATCATGGGT CAAAATGATT TAGTtAAAAC GTTACGAATG AATTATTTGT 3240

TTGATTTTaT CAATCCTTAT TGACGAATAA ACAACGTaAT TATTTGGAAT TATTTTATCT 3300 

TGAAGATTAT TCTTTAAGTG AAATCGCAGa TACTTTTAAT GTGAGTAGaC AAGCAGTTTA 3360 

TGATAATATA AGAAGAACTG GCGATTTAGT TGAAGATTAT GAAAAGAAAT TGGAATTATA 3420 

CCAGAAATTT GAGCAACGCC GAGAAATATA TGATGAAATG AAACCACATT TAAGTAATCC 3480 

AGAACAAATA C 3491 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4253 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

AGTACGTTTT ATAATTATAA GTACGTAATT AACATATTAA CATATCGCAA GTATGTATTT 60

AAATAAgATT GTTATAATTT CAAAGTTCAT CCAAGaTTAT GGCGTTTGCA TTTACCTATT 120

AAAAACGTTA TTATATCAAA GATGCGAAAG ATAATACGGG TTTATTTTAT GAAAGTGAGA 180

AGGATAAAAT GGATAATGAG CAACGCTTAA AAAGAAGAGA GAATATAAGG AATTTCTCGA 240

TTATAGCACA TATTGACCAC GGAAAATCTA CATTGGCTGA TAGAATTTTA GAAAATACCA 300

AATCAGTTGA AACAAGAGAT ATGCAAGATC AGTTACTAGA TTCAATGGAT TTAGAAAGAG 360

AACGTGGTAT TACAATCAAA TTAAACGCgT ACGTTTAAAG TACGAAGCTA AAGATGGAAA 420

TACTTATACA TTCCATTTAA TCGATACGCC TGGACACGTC GATTTTACAT ATGAAGTGTC 480

ACGTTcTTTG GCAGCTTGTG AGGGCGCGAT TTTAGTAGTA GATGCGGCTC AAGGTATCGA 540

AGCACAAACA TTAGCAAATG TTTATTTAGC ATTAGATAAT GAGTTAGAGT TATTGCCTGT 600

TATTAACAAA ATTGATTTAC CTGCTGCAGA ACCTGAACGC GTGAAACAAG AAATTGAAGA 660

TATGATAGGT TTAGACCAAG ACGATGTTGT TTTAGCAAGT GCTAAATCTA ACATTGGAAT 720

TGAAGAGATA CTAGAGAAAA TAGTTGAAGT TGTGCCAGCT CCAGATGGTG ACCCAGAAGC 780

ACCACTAAAA GCGTTAATAT TTGATTCTGA GTATGATCCA TATAGAGGGG TAATTTCATC 840

GATAAGAATT GTGGACGGTG TTGTTAAAGC CGGAGATAAA ATTCGAATGA TGGCCACTGG 900

TAAAGAGTTC GAAGTAACAG AAGTTGGAAT TAATACACCT AAGCAGCTTC CAGTTGATGA 960

ATTAACAGTT GGTGATGTTG GTTATATTAT TGCAAGTATT AAAAATGTTG ATGATTCTAG 1020

GGTTGGTGAC ACCATCACAT TAGCTAGTAG ACCTGCATCA GAACCATTGC AAGGTTATAA 1080

GAAAATGAAT CCAATGGTAT ATTGCGGACT GTTCCCAATA GATAACAAAA ATTATAATGA 1140

TTTAAGAGAA GCATTAGAAA AATTACAATT GAATGATGCA TCATTAGAAT TTGAGCCTGA 1200

ATCGTCACAA GCATTAGGTT TTGGTTATAG AACTGGTTTC TTAGGTATGT TACACATGGA 1260

AATAATTCAA GAAAGAATTG AAAGAGAATT TGGTATTGAA TTAATTGCAA CTGCACCATC 1320

TGTAATTTAT CAATGTGTTT TAAGGGACGG TTCAGAAGTG ACGGTTGATA ACCCAGCACA 1380

AATGCCAGAT CGTGATAAAA TTGATAAAAT ATTTGAGCCA TATGTTCGTG CAaCTATGAT 1440
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TATAAATATG GACTATTTAG ATGATATTCG TGTAAATATT GTTTATGAAT TACCTTTAGC 1560 

TGAAGTTGTA TTTGATTTCT TCGATCAACT TAAATCTAAT ACTAAAGGAT ATGCATCATT 1620

TGATTATGAA TTCATCGAAA ATAAAGAAAG TAATTTAGTC AAGATGGATA TTTTATTAAA 1680 

TGGTGATAAA GTGGATGCGC TAAGCTTCAT AGTTCATAGA GATTTTGCAT ATGAACGTGG 1740 

TAAAGCATTA GTTGAAAAAC TTAAAACGTT AATTCCAAGA CAGCAATTTG AAGTACCTGT 1800 

ACAGGCTGCA ATAGGACAAA AAATTGTAGC GCGTACAAAT ATTAAATCAA TGGGTAAAAA 1860 

CGTTTTAGCT AAATGTTATG GCGGTGACAT AAGCCGTAAA CGTAAATTAC TTGAAAAACA 1920 

AAAAGCAGGT AAAGCTAAGA TGAAAGCAGT TGGTAATGTT GAAATTCCAC AAGATGCTTT 1980

CTTGGCTGTA TTGAAAATGG ATGATGAATA ATTTTAAAAA ATCAATTAAC AATTTACAAT 2040

GAATAAAGTT TAATAACTAA AAAGAGGGAG CCTAGGATAA ATTAACGTCC TGGGCTTTAC 2100 

AATGTTATAT TGGCAGCCAT CGACAGAGTT AAAATGAGCT TATAACAATG GGGCCCCAAC 2160 

ACAGAAGCTG ACGAAAAGTC AGCTTACTAT AATGTGCAAG TTGGGGTGGG GCCCCAACAT 2220 

AGAGAATTTC GAAAAGAAAT TCTACAGGCA ATGCAAGTTG GGGTGGGACG ACGAAATAAA 2280 

TTTTGCGAAA ATATCATTTC TGTCCCACTC CCTTATGCAT GAGTTTTACT CATGTAATTT 2340 

TATTTTTAAG GACATATTAC ATCTGGCTAA TGTGTAAGAG CCACTACATA ATAAATCATT 2400 

AGTGGTTCTT TATTATTTCT ATCTCACTCC CTCTAAACAA GAATAAATAT TAAAATGAAT 2460 

CGATATATTA GACAATCATT GATTAAACGT TAAAGTTAAA AGTAAGAATA ATTGCAGATA 2520 

GTCCAACAGG ATATAGCCGA TTGGATAAAA AGTCTGAGAA GCGGGGCATT AAAATGACGG 2580 

TACAAAGTGC ATATATACAT ATTCCATTTT GTGTAAGAAT ATGTACATAT TGTGATTTCA 2640

ATAAATATTT TATACAGAAT CAACCTGTAG ATGAGTACTT AGATGCACTA ATCACAGAAA 2700 

TGTCTACAGC AAAATATAGG ATCTTAAAGA CCATGTATGT AGGTGGCGGC ACACCAACGG 2760 

CCCTTTCTAT TAATCaGTTG GAAAGATTAC TTAAAGCAAT ACGTGATACG TTTACAATCA 2820

CAGGCGAGTA TACATTTGAA GCAAATCCTG ATGAGTTAAC TAAAGAGAAA GTCCAACTAT 2880 

TAGAGAAATA TGGAGTAAAA AGGATTTCAA TGGGCGTTCA AACATTCAAG CCGGAGTTAT 2940 

TGTCTGTTTT AGGTAGAACG CACAATACTG AAGATATTTA CACTTCGGTG TTAAATGCTA 3000 

AAAACGCAGG TATTAAATCA ATCAGTTTAG ATTTAATGTA TCATTTACCG AAACAGACGA 3060 

TTGAAGATTT TGAACAAAGT TTAGATCTAG CTTTAGATAT GGATATTCAA CATATTTCGA 3120 

GTTACGGCTT AATACTTGAA CCTAAAACCC AATTTTATAA TATGTATAGA AAAGGCTTGC 3180 

TCAAACTTGC TAATGAGGAT TTAGGTGCTG ACATGTATCA GTTGCTGATG TCTAAGATAG 3240 
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5 



AACATAATAA GGTTTACTGG TTTAATGAGG AATATTATGG ATTTGGAGCA GGTGCAAGTG 3360

GTTATGTAGA TGGTGTGCGT TATACGAATA TCAATCCAGT GAATCATTAT ATCAAAGCTA 3420

TAAATAAAGA AAGTAAAGCA ATTTTAGTAT CAAATAAACC TTCTTTGACT GAGAGAATGG 3480

AAGAAGAAAT GTTTCTTGGG TTGCGTTTAA ATGAAGGTGT GAGTAGTAGT AGGTTCAAAA 3540

AGAAGTTTGA CCAATCTATT GAAAGTGTCT TTGGTCAAAC AATAAATAAT TTAAAAGAGA 3600

AGGAATTAAT TGTAGAAAAG AACGATGTGA TTGCACTTAC AAATAGAGGG AAAGTCATAG 3660

GTAATGAGGT TTTTGAAGCT TTCCTAATAA ATGATTAAAA AAAATTGAAA TTTCGAGTCT 3720

TTAACATTGA CTTACTTTGA CCAATTTGAT AAATTATAAT TAGCACTTGA GATAAGTGAG 3780

TGCTAATGAG GTGAAAACAT GATTACAGAT AGGCAATTGA GTATATTAAA CGCAATTGTT 3840

GAGGATTATG TTGATTTTGG ACAACCCGTT GGTTCTAAAA CACTAATTGA GCGACATAAC 3900

TTGAATGTTA GTCCTGCTAC AATTAGAAAT GAGATGAAAC AGCTTGAAGA TTTAAACTAT 3960

ATCGAGAAGA CACATAGTTC TTCAGGGCGT TCGCCATCAC AATTAGGTTT TAGGTATTAT 4020

GTCAATCGTT TACTTGAACA AACATCTCAT CAAAAAACAA ATAAATTAAG ACGATTAAAT 4080

CAATTGTTAG TTGAGAATCA ATATGATGTA TCATCAGCAT TGACATATTT TGCAGATGAA 4140

TTATCAAATA TATCTCAATA TACAACTTTA GTTGTTCATC CTAATCATAA ACAAGATATT 4200

ATCAATAATG TACACTTGAT TCGTGCTAAT CCTAATTTAG TTATAATGGT TAT 4253

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3395 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

TCCCTAATCG AACAAAATTA TGCGCATAAA CAAAGTAGAT TGATATAAAA TTCTTAATTA 60 

TCAGAATATA TTTACAAATC TGAATTTTAT TAGTATATTG GrTAGTrTTC ATAGAGGCAT 120 

GACGGTaTTT GAGCAGGATT TTAAATCGGg ATTTTATAAT CGATTTAAGA GAGGCCACtT 180 

TGCTTGcACA TTAATACTGT CAATGGGAGG GGAATGTATA TGAGTrAAGC ACATCAATTA 240 

ATTCAAGAGG ATGAACATTA TTTTGCGAAA TCAGGACGTA TTAAATATTA TCCGTTAGTG 300 

ATTGATCATG GATATGGAGC AACATTGGTT GATATTGAGG GGAAGACATA TATCGATTTG 360 

TTATCGAGTG CGAGTTCTCA AAACGTAGGT CATGCACCTA GAGAAGTAAC AGAAGCGATA 420
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GTACGTTTAG CTAAGAAGCT TTGTGAGATT GCACCTGGAG ATTTTGAAAA AAGAGTGACC 540 

TTCGGATTAA CCGGATCAGA CGCAAATGAT GGCATCATTA AATTTGCCAG AGCATATACA 600 

GGGCGTCCTT ATATCATTAG TTTCACTAAT GCATATCATG GTTCAACTTT TGGCTCATTG 660 

TCTATGTCAG CTATTAGTTT AAATATGCGC AAACATTATG GTCCGTTATT GAATGGTTTT 720 

TATCATATTC CGTTTCCAGA TAAATATCGT GGTATGTACG AGCAGCCACA AGCTAATTCA 780 

GTAGAAGAAT ATTTAGCACC CTTAAAAGAA ATGTTTGCGA AGTATGTACC TGCTGACGAA 840 

GTAGCATGTA TTGTTATTGA AACGATACAA GGCGATGGTG GACTTTTAGA ACCAGTTCCA 900 

GGGTATTTTG AAGCGTTAGA AAAGATTTGT CGTGAACATG GTATTTTAAT CGCTGTCGAT 960

GATATTCAAC AAGGTTTTGG GAGAACAGGT ACATGGAGTT CAGTCTCGCA TTTTAATTTT 1020 

ACGCCTGATT TAATCACTTT CGGAAAATCC TTAGCAGGTG GTATGCCTAT GTCAGCAATT 1080 

GTTGGACGCA AAGAGATTAT GAATTGTTTA GAAGCACCAG CACATTTATT TACAACAGGT 1140

GCTAATCCAG TTAGTTGTGA AGCTGCATTA GCCACAATTC AAATGATTGA AGATCAGTCG 1200 

CTTCTTCAGG CTAGTGCGGA AAAAGGGGAA TATGTTAGGA AACGAATGGA TCAATGGGTA 1260 

TCTAAATACA ATAGTGTAGG CGATGTTAGA GGTAAAGGTC TGAGCATTGG TATTGATATT 1320 

GTTTCCGACA AAAAACTCAA AACACGTGAT GCCAGTGCGG CACTTAAAAT TTGTAATTAC 1380 

TGCTTTGAGC ATGGCGTAGT TATTATAGCT GTAGCAGGAA ATGTGTTGCG ATTCCAACCG 1440

CCATTGGTAA TAACATATGA GCAATTAGAC ACGGCGTTAA ACACTATAGA AGATGCACTG 1500 

ACTGCTTTGG AAGCAGGTAA CTTAGATCAA TATGACATAT CTGGACAAGG TTGGTAATAG 1560 

CGATTATCTT AATATAAAAT AAAAAATCAT TTCCACATCT GGATGTTAAT CAGATGGGAA 1620

ATGATTTTTT TTATTTTTTA TTTTGGTGGG TGGTATTCAG CTACGTCATT TTTCTTAGAA 1680 

TGTGTAAGTC CATAACTTAA ATATAGGATG ATACCAACAA TAAACCAAAT TAAAGTGTAT 1740 

AATTTCGCTT CGAATCCTAA TCCCCAGAAT ACTAGCAATA CTAAAACAAA TGTAATTGCT 1800

GGTAACACAG GATATAAAGG TAATTTAAAT GCAGGAATTG GTAGATCTTT ACCTTcACGC 1860 

TTTCTCAAAC GATACATTGC TAATGAAACG AACATAAATG CAACAAGTGT ACCTGCTGAA 1920 

ATTAATTGTG CTAAAAATGC GAATGGGAAC ATAGAACCAA TTAAAACACC AATAATAGTA 1980 

AGTATAACTA GTGCGCGATT AGGTAAATGT TTGTCGTTTA AGTGGCTTAA CCATGAAGGT 2040

AATAAGCCGT CACGTCCAAA TGAATAAAGT AAACGTGAGC CTGCTAACAT CATACCAATT 2100 

AATGCTGTAA ACATACCGAT AACAGAGATA GCTTGAACAA TAGCTGCTAC AACACCATGA 2160 

CCACTTTGAC GTAAAGCCCA ACCAACAGGT TCAGCATTGT TTGCGTATTG TGAGTAATGG 2220 
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CCAAGAATAC CTCTAGGCAT TGTCTTTTGA GGATCAAGTG CTTCTGCTGA GTTTGCTGCG 2340

ATAGAATCGA AACCGATATA CGCTAAGAAA ATCATTGAAA CACCAGCATA TATGCCTTGC 2400 

CATCCACCAA AGTCACCTGT AGCAGTTACT TTGTGTTCTG GAATAAATGG CACATAGTTA 2460 

CTAACATTTA TTGCTGTTAA ACCTACGATG ACAAATAAAA TAATAGCTAA TACTTTTAAA 2520 

ATAACTAAAA TATTTTCCAT ACGAGCTGCT TCCGACATAC CACGTGATAG TAATAATGCA 2580

GTTAATAAAA TAACGATAGC AGCAATAATA TCGATAAAAC CGCCATTTGT ACCAAATGGA 2640

TTTGATAATG CTGCAGGTAA TTCGATGCCA ATTGGTTTCA CAAGTCCGCG TAAATTCGCT 2700 

GAGAATCCTG ATGCAACAAA GGCTACGGCG ATAAAATATT CAGCTAATAG AGCCCAACCG 2760 

GCAACCCATC CAAAAAATTC ACCAAATAAT ACATTGACCC AAGAATAGGC TGAACCTGCA 2820 

AATGGCATAG CGGCAGCCAT TTCTGCATAA GTAAATGCAA CTAAACCAGC AACAATAGCA 2880 

GCGAGTAAGA ATGATAACGC AACGGCCGGT CCTGCATGTT CTGCAGCAAC AATGCCAGGT 2940

AGCGTAAAGA TAGATGTCGA TACAATTGTT CCTACACCTA AAGCTAAGAA ATCACGCACC 3000

CGAAGTGTAC GCTTTAAATG ACCATCTTTA TTTTGATAGA TAGCCGGATC CTCTTTTCGT 3060 

GCTATTTTAT TGAAAAAACT TCCCATAAAC TTTCCTCCCA AACATTCATA AACAATTCTA 3120 

TACGGTGTTT TTTAATATGT TATATCATAG CACAAATAAT CAATATTTTG TCTAAAAATT 3180 

CTGAAAAATC ACAACTTTAT GTTACGTATT AATGACTTGT CTTGATAACA TCCATAGATT 3240

TTTTAAATGA TAAAACTGAT TATAACAGAT ATTAAATGAA TAAGTACTAT TTTTTGCnAA 3300 

TTTTCTAACA ATTTTGCACA TTATATGTTT AAAATCAATT TCATGTTTAT GGTCTGATTG 3360 

GCTAGTGTGT ATGAAATGTA AnTCTTTGAC TnnGA 3395 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13508 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

ATCAGGTAAT GCCATGCGTT TAGCTGAAAA TTTTTTCAGA ACGTTTAAGT GATATCGGAC 60 

ATCAAGTTGT TTTGATGTCA ATGGATGAAT ATGATACGAC AAACATCGCG CAGTTAGAAG 120 

ATTTATTTAT TATTACGTCT ACTCATGGTG AAGGAGAACC GCCTGATAAT GCATGGGATT 180 

TCTTTGAATT TTTAGAAGAC GATAACGCAC CTAATTTAAA TCATGTGAGA TATTCAGTAC 240 
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TACTAGAAAA TCTAGGCGCT GAGCGTATAT GTAAGCGTGT AGATTGTGAT ATTGATTATG 360 

AAGAAGACGC AGAAAAGTGG ATGGCAGACA TCATTAATAT TATTGATACC ACATCAGAAG 420 

GTATTCAAAG TGAATCGGTG ATAAGTGAAT CAATTAAGTC TGCCAAAGAA AAGAAATATT 480 

CTAAATCAAA TCCATACCAA GCAGAAGTAT TAGCGAATAT CAATTTAAAT GGTACCGATT 540

CAAATAAAGA AACACGACAT ATAGAATTTT TACTTGATGA TTTTAGTGAA TCATATGAAC 600 

CAGGAGATTG TATAGTAGCA TTACCGCAAA ACGACCCTGA ATTGGTTGAA AAACTAATAT 660 

CCATGTTAGG TTGGGATCCG CAATCTCCGG TGCCAATTAA TGATCATGGT GATACAGTTC 720 

CTATTGTTGA AGCACTAACA TCACATTTTG AATTTACTAA ATTAACATTG CCATTATTGA 780 

AAAATGCAGA TATCTATTTT GACAATGAAG AATTATCTGA ACGTATTCAA GATGAGTCAT 840

GGGCGCGTGA ATATGTTATA AATCGGGACT TTATAGATTT AATAACAGAT TTTCCAACTA 900 

TAGAATTACA ACCTGAGAAT ATGTATCAAA TCCTTAGAAA ATTACCACCA AGAGAGTATT 960

CGATTTCTAG TAGTTTTATG GCAACGCcAG ATGAAGTGCA TATTACCGTT GGTACGGTTC 1020 

GTTATCAAGC ACATGGACGT GAGAGAAAAG GTGTATGCTC GGTTCATTTT GCTGAGCGAA 1080 

TTAAACCAGG CGATATAGTA CCAATTTATT TGAAGAAAAA TCCGAACTTC AAATTTCCGA 1140 

TGAAGCAAGA TATACCGGTT ATTATGATTG GACCAGGTAC TGrAATTGCT CCTTTTAGAG 1200 

CATATTTACA AGAACGTGAA GAACTTGGTA TGACTGGAAA AACATGGTTG TTCTTTGGTG 1260 

ATCAACACCG TAGTTCTGAC TTTTTATATG AAGAAGAAAT AGAAGAATGG CTTGAAAATG 1320

GAAACTTAAC ACGCGTAGAT TTAGCATTTT CAAGAGACCA AGAACACAAA GAATATGTAC 1380 

AGCATCGTAT AATGGAAGAA AGTAAACGTT TCAATGAATG GATTGAGCAA GGCGCACAAT 1440

CTATATTTGT GGCGATGAAA AATGTATGGC GAAAGATGTC CATCAAGCCA TTAAAGATGT 1500 

ATTGGTAAAA GAACGTCATA TTTCTCAAGA AGAAGCAGAG TTATTATTGC GACAAATGAA 1560

ACAACAACAA CGCTATCAAC GTGATGTTTA TTAGCGATTG GTGTTAAATA TTTTAAGGTG 1620

TAATGATGTA AAAAGATATA AAGGATGTTG CTCAACATGA ATATGCCATT AATGATAGAT 1680 

TTAACAAATA AAAATGTCGT CATAGTTGGT GGAGGCGTCG TTGCAAGTCG TCGGGCACAA 1740 

ACATTAAATC AATACGTTGA ACATATGACG GTCATCAGTC CGACAATCAC TGAAAAACTT 1800 

CAAAATATGG TAGATAACGG TGTCGTCATA TGGAAAGAAA AAGAATTTGA ACCAAGCGAT 1860 

ATTGTAGACG CGTATCTAGT TATTGCAGCA ACCAATGAGC CACGTGTCAA TGAAGCGGTA 1920 

AAAAAAGCCT TACCTGAGCA TGCCCTTTTT AATAATGTTG GAGATGCATC AAATGGCAAT 1980 

GTTGTATTTC CAAGTGCACT ACACCGCGAC AAGCTAACTA TCAGTGTATC AACTGATGGT 2040
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TACAGTTCGT ATATCGACTT TTTATATACT TGCCGACAGA AAATAAAAGT ACTTGATATA 2160

ACATATAACG AAAAGCAACA GTTACTGTCA CAAATTGTGT CACAAGAATA TTTAAATCAT 2220

GACAAACAAG CTCAATTTTT AGCGTGGTTG GATGTAAGAT AATAATAGCG GACCGTGTAA 2280

CCGTCTAAGG TAAGTCTTCT TATTTTAACT TTAACGCTTA ATCATTGAAA TTAAGACATG 2340

GGCGGCTTTG TGAATAGTCT AATAATGAAG GATTTAAGCG ATAATGATAT GCGTTTTAAA 2400

TATGAATATT ACAATAGAGA AAAAGATACG TAGAACAAAC TTAATAAAAT AGGTGGATAA 2460

ATTGAAATCT GGTTGAAGTC GTTACTATCA TAGCGACCTT TAGCCAGATT TTTTGTGCAA 2520

TAGAAAGCAA TAATAAAAAT GATAGATCAA AATGAAATAC AGGACAGGAT ATACAAGGAT 2580

TAGTCATGCC ATGTTATCAA GTAGGAAAAT CAAACTTCAC TATTGATAGT TACGCAAAAA 2640

AGATTTTTTT GATAAAATGA GATAACTTAA ATATAAAAAA TTATATTAAT TATAATATTT 2700

AAGTTAAAGA GGGGGATTAT GTAAATTGTA TTAAAAGTGG AGGGAGAAAA TAATATGAAT 2760

AGTGATAATA TGTGGTTAAC AGTAATGGGG CTCATTATTA TTATTTCAAT TGTAGGTTTA 2820

CTCATTGCCA AAAAGATAAA TCCAGTTGTA GGTATGACAA TCATACCTTG CTTAGGGGCA 2880

ATGATTTTAG GATATAGTGT GACAGATTTG GTTGGATTTT TTGCTAAAGG GTTAGATCAA 2940

GTCATCAACG TTGTTATTAT GTTTATCTTT GCCATTATTT TCTTTGGCAT CATGAACGAT 3000

AGTGGTTTAT TCAAGCCGCT TGTCAAACGC TTAATATTAA TGACACGAGG CAATGTCGTC 3060

ATTGTCTGTG CAATGACAGC TTTAATTGGC ACAATAGCCC AATTAGATGG GGCCGGTGCG 3120

GTAACATTTT TGCTTTCTAT TCCTGCATTA TTACCTTTAT ATAAAGCGTT AAATATGAAT 3180

AAATATTTAT TGATTTTACT ATTAGCATTA AGCGCGGCGA TTATGAACAT GGTACCTTGG 3240

GGAGGTCCAA TGGCTCGTGT AGCTGCAGTG TTAAAAGCCA AAAGTGTCAA TGAATTATGG 3300

TATGCATTAA TACCTATTCA AATAATAGGT TTCATTCTTG TTATGTTGTT TGCGGTATAT 3360

CTTCSSATTTA AAGAACAGAA ACGTATCAAA AAAGCAATAG AGAGAAA7GA ATTACCGCAA 3420

ACACAAGATA TAGATGTACA TAAATTAGTT GAAGTATATG AACGAGATCA AGATGTAAGG 3480

TTTCCTGTAA AAGGACGTGC AAGAACAAAA TCATGGATAA AATGGGTGAA TACAGCTTTA 3540

ACTTTAGCTG TTATTCTATC GATGTTAATA AATATTGCGC CACCTGAATT TGCATTCATG 3600

ATAGGTGTTy CGTTGGCACT TGTTATTAAT TTTAAATCAG TGGATGAACA AATGGAACGA 3660

TTAAGAGCgC ATGCGCCGAA TGCATTAATG ATGGCTGCAG TGATTATTGC AGCAGGTATG 3720

TTTTTAGGTG TACTAAATGA AACCGGTATG CTTAAAGCGA TTGCGACCAA TTTAATCAAA 3780

GTGATTCCTG CAGAAGTAGG ACCATACTTG CATATTATTG TAGGTTTACT TGGCGTACCA 3840

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ACAGCAGGGC AATTTGGTGT ACCGTCTGTA TCAACAGCTT ATTCAATGGT CATAGGGAAT 3960

ATTATAGGTA CATTTGTCAG CCCATTTTCA CCAGCCTTAT GGTTGGCAAT TGGTTTAGCA 4020

GAGGCAAACA TGGGCACGTA TATTAAGTAT GCATTCTTTT GGATTTGGGG ATTCGCTATC 4080

GTTATGTTAG TAATTGCAAT GTTGATGGGC ATTGTGACGA TTTAAGTATG AAAAAATAGA 4140

AACTATGGTC ACGTTGCAAA ATGAAATAAT AGTTGCATAA ACATGTCGAA ATGACGGACG 4200

AATCTTTAAA CAATTTTAAA AATTAATGAA ATAATTGTGT AGAAATATGA ATTTCACTAA 4260

ATGTTAATAA CTTTGTGACG TTTTAGTTAA CAGACTAATA AAAATTTGAA AATACTATAT 4320

ATAGTGGTAT AACGTAATGA GTAGACACAA TATATAGGAA GAAGGGGTAA AATGAATCAA 4380

ATCGAAGAAG CATTAACGGG TTTGATTTCT AAAGATCCTG CTATTGTTAA CGAAAATGCT 4440

AACAAAGATA GTGATACATT TTCAACAATG AGAGATTTAA CAGCAGGTAT CGTTTCTAAA 4500

TCTTACGCAT TAAATCATTT ATTACCAAAG CACGTTGCAG ATGCACATCA AAGAGGGGAC 4560

ATACATTTTC ACGACTTAGA TTATCATCCA TTCCAACCGT TAACTAACTG TTGTTTAATA 4620

GATGCTAAAA ATATGCTACA TAATGGATTT GAAATAGGCA ACGCGAATGT AACTTCACCA 4680

AAATCAATAC AAACTGCATC AGCGCAGCTT GTACAAATTA TAGCCAATGT TTCTAGCAGT 4740

CAATATGGTG GCTGTAcGGT TGACCgCGTT GACGAATTAC tTAGTACATA TGCACGACcA 4800

TAATGAAGAA CAACATAGGA ATATsCGCAA AGCAATTTGT CAAAGAATCT GAAATTGATC 4860

GTTATGTTGA TCAACAAGTC ACTAAAGACA TCAATGATGC GATTGAAAGT TTAGAATATG 4920

AAATTAATAC CTTATATACA TCTAATGGAC AGACACCTTT TGTAACATTA GGATTCGGCT 4980

TAGGTACAGA TCATTTAAGT CGCAAAATTC AACAAGCTAT CTTAAATACT CGTATCAAAG 5040

GCTTAGGAAA AGACCGCACG ACAGCGATTT TCCCGAAACT TGTATTTTCA ATTAAAAAAG 5100

GAACC^ACTT TAGTCCGCAA GATCCGAACT ATGACATTAA ACAACTAGCA TTAAAGTGTT 5160

CAACGAAACG TATGTATCCA GATATTTTAA ATTATGACAA ACTCGTAGAA ATATTAGGTG 5220

ATTTCAAAGC GCCAATGGGT TGTCGTTCAT TTTTACCAAG TTGGAAAGAT GCGGAAGGTC 5280

ATTTTGAAAA TAATGGTCGT TGTAATCTTG GTGTTGTTAC ACTTAATTTA CCTAGAATGG 5340

CATTAGAATC TGCCGGTAAT ATGACGAAAT TCTGGGAAAT CTTTTATGAA CGTATCGATG 5400

TGTTACATGA TGCATTACTT TATCGTATAA ATCGTTTGAA AGATGCTGTA CCGAATAACG 5460

CACCGATTTT ATATAAAAGT GGCGCATTTA ACTATAAATT AAAAGAAACA GATGATGTTG 5520

CTGAGTTATT TAAAAATAAA CGTGCAACGA TTTCAATGGG CTATATAGGG TTGTATGAAA 5580

CAGCTACTGT TTTCTATGGT CCAGACTGGG AAACATCTCA AGAAGCAAAA GCATTTACGC 5640

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10 



GGTTCAGTAT TTmCAGTACG CCGAGTGAAT CGCTAcGGAT CGTTTTTGTC GTTTAGACCA 5760 

AGAGAGATTT GGAGATATTA AAGACATTAC AGATAAAGGA TATTATCAAA ACTCTTTCCA 5820 

TTATGATGTA CGTAAAGATG TTACACCTTT TGAAAAGTTA GATTTTGAAA AAGATTATCC 5880 

TTATTATGCG AGTGGTGGTT TCATTCACTA TTGTGAGTAT CCGAAATTGC AACACAATTT 5940 

GAAAGCACTA GAAGCGGTAT GGGACTACTC TTATGACAAA GTTGGTTACT TAGGTACAAA 6000 

TATTCCGATT GATCATTGTT ATGAATGTGA TTACGATGGA GATTTTGAAG CAACTGAAAA 6060 

AGGATTTAAA TGCCCGAACT GTGGCAATGA TAATCCTAAA ACAGTTGATG TCGTTAAACG 6120 

AACATGTGGT TACCTAGGCA ATCCAGTTCA ACGTCCAGTA ATTAAAGGCC GTCATAAAGA 6180

AATTTGCGCA CGAGTAAAAC ATATGAAAGC GCCTAAAGAA TGATACTTTT AGACATTAAA 6240

CAAGGACAAG GTTATATTGC TAAAATAGAA TCAAATAGCT TTGTTGACGG TGAAGGAGTA 6300 

AGATGCAGTG TTTATGTATC AGGATGTCCA TTTAATTGTG TTGGATGTTA TAACAAAGCC 6360

TCACAAAAGT TCAGATATGG CGAGAAATAC ACTGATGAAA TATTAGCAGA AATATTAGAT 6420 

GATTGCGATC ATGATTATAT ATCTGGGCTA AGTCTATTAG GTGGCGAACC ATTTTGTAAT 6480

TTGGATATTA CATTAAATCT TGTCAAAGCA TTTCGAGCAC GTTTTGGAAA TACAAAGACA 6540 

ATTTGGGTAT GGACTGGATT TTTATATGAA TATTTAGCAA ATGATTGTAC AGAACGTCGA 6600 

GAGTTATTAT CATACATTGA CGTTTTAGTA GATGGTCTAT TTATACAACA CTTATTCAAA 6660 

CCTGATTTAC CATATAAAGG TTCTTTAAAT CAACGCATTA TAGATGTACA ACAATCACTC 6720 

TCGCATGCGC GTATGATTGA ATATATAGTT AGTTGAATAT GTATTAGAAG TCAAGGTAAC 6780 

ATTCGTTGCC TTGGCTTCTT TTTAGGTTAG GTACATAATT GAAAGTTAAT AAAAGCAATT 6840 

CTTTATAAAA ATATATTGAT AGAATATGAC CTAACAATCA TTTTGATACC AATACTAAAA 6900 

GTTGCATATC CGTTTTTTAA AAAAGTTGAA AGAGAAAAGT GGTATTTTAG TGGGAAGGAA 6960 

GTCTAACTTT TTGGTAGCGT TTTACAATAA ATAAATATTC GTTAATAACG TATAAATATT 7020

CTTAAATGCC ATTCTAGTAA AATTTGTTAA ATTCGTTAAA TCGTAACTTA ACACTGTTAT 7080

TTTAGCGCTA TTAAGGTTTT GTTTATTACG GGAAAAATTA TATAAATATT CAATAATTGC 7140 

CAAGTTTCAA ATTGTATGAA ATTTGCATTA TTATTAAATG TTAGTTATTG TCAATTTTGT 7200

GAATCAATAT AATTATTACA TTTTGAGATA AATCGAAACA GGATTCATAA AATTAATAAT 7260 

TAGGGGGAGC ACAATTGAAA AAAGAGAAAG TTATGGACTG GACGACCTTT ATAGGGACAG 7320 

TAGCTGTACT TCTTTTTGCA GTTATACCTA TGATGGCTTT TCCAAAAGCA AGTGAAGATA 7380

TCATCACTGG TATTAATAGT GCCATTTCTG ATTCAATTGG TTCGATATAT TTATTTATGG 7440 

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TTGGTAAAGC AAGTGATAAA CCAGAATTTA ATACATTTAC ATGGGCGGCA ATGCTGTTTT 7560

GTGCAGGCAT AGGCTCTGAT ATTTTATACT GGGGCGTTAT TGAATGGGCT TTTTACTATC 7620

AAGTTCCACC AAATGGCGCG AAAAGTATGA GTGATGAAGC ACTCCAATAT GCGACGCAAT 7680

ATGGTATGTT CCACTGGGGG CCAATTGCTT GGGCTATTTA TGTTCTACCA GCATTACCAA 7740

TTGGTTATTT AGTATTTGTT AAAAAACAAC CGGTGTATAA AATTAGTCAA GCTTGTCGTC 7800

CGATTTTAAA AGGTCAAACA GATAAATTTG TAGGTAAAGT TGTAGATATC TTATTTATCT 7860

TTGGATTGCT AGGTGGTGCG GCAACATCAC TAGCGTTAGG TGTGCCATTA ATTTCTGCAG 7920

GCATAGAAAG ATTAACTGGT TTAGATGGTA AAAATATGAT TTTACGTTCG GCCATTTTAT 7980

TAACAATCAC GGTTATATTT GCCATTAGTT CATATACAGG ATTGAAAAAA GGTATTCAAA 8040

AGTTAAGTGA TATCAACGTT TGGCTATCCT TTGTACTTTT AGCCTTTATA TTTATTATTG 8100

GACCGACTGT TTTTATTATG GAAACGACAG TGACAGGGTT CGGAAATATG TTGAGAGATT 8160

TCTTTCATAT GGCAACATGG TTAGAACCAT TCGGTGGTAT TAAAGGTCGA AAAGAAACGA 8220

ATTTCCCACA AGACTGGACA ATATTCTACT GGTCATGGTG GTTAGTATAT GCGCCATTTA 8280

TCGGTTTATT TATCGCTAGA ATTTCAAAAG GTCGACGCCT TAAAGAAGTC GTGCTAGGAA 8340

CAATTATTTA TGGAACGCTT GGATGCGTAT TATTCTTTGG TATTTTTGGT AACTATGCTG 8400

TGTATTTACA AATTTCTGGA CAGTTTAATG TAACACAATA TTTAAATACA CATGGTACAG 8460

AGGCAACCAT TATTGAAGTG GTGCATCATT TACCATTCCC ATCATTGATG ATTGTACTAT 8520

TCTTAGTATC TGCTTTCTTA TTCTTAGCAA CAACATTTGA TTCGGGTTCA TATATTTTAG 8580

CGGCAGCATC TCAGAAAAAA GTGGTAGGCG AACCATTACG TGCCAATCGT TTATTCTGGG 8640

CATTTGCATT GTGCTTATTG CCATTTTCAT TGATGCTAGT TGGTGGTGAA CGTGCATTAG 8700

AAGTATTGAA AACTGCTTCA ATACTGGCAA GTGTGCCATT AATTGTTATT TTTATTTTCA 8760

TGATGATATC ATTTTTAATC ATTTTAGGGC GCGATAGAAT TAAACTTGAA ACGCGTGCTG 8820

AAAAATTAAA AGAAGTTGAA CGTCGTTCAT TGCGAATCGT TCAAGTATCa GAAGAAGAAC 8880

AAGACGATAA TTTATAATTC AAAGCGGGTC TGGGACGACG AAATGaATTT TGTGAAAATA 8940

TCATTTCTGT TCCaTTCCCC TTTTTTTAGT AGCATTGTAG GATGAACTTT TAGGTTTTCA 9000

TTAATGTTGT ACTAAAAGAT TTAATTTTTT AGTGCTCCAA GTACTTATTT ATTGTATGAA 9060

GCATATTCTA AATCGAAGTT TGAAAGACTC TCATTGATTA TTAAATTAAA TAAAGGGTAT 9120

GCGTATGTAC AATTCAAATT AATCGAAGGA TGAAATAAAA TGACTAATCA ATTTAAAAAT 9180

AAACAGTCCA AATTACATGA CAGTTTAGAA TCCATCACAA AAAACTTATA TGCGACACCT 9240

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ACAGAATATT GTTATCTATC ATTCCGGACA CTTAGGTGAC TCCCAACAAG ACATTGCATC 9360 

ATTAGGTGGT GTTTCAAAAG TATTGATGAA TCATGATCAT GAATCTATAG GAGGTTCTAA 9420 

TCAAGTTGAA GCCCCTTACT TTATACATGA AAATGATGTG GCTGCACTGA AACATAAGAT 9480

TTCTGTTCAA AAACAATTTA GTAATCGTGT AATGTTGGAT AAGGATTTAG AAGTTATTCC 9540 

CGCGCCTGGA CATACACCAG GGACGACACT ATTTTTATGG GATGATGGTC ATCACCGTTA 9600 

CTTATTTACT GGAGATTTTA TATGTTTTGA AGGGAAGAGA TGGCGTACAG TTATATTAGG 9660 

TTCAAGTGAT AGAGAAAAAT CTATTCAAAG TTTAGAGATG GTTAAAGAAT TAGATTTTGA 9720 

TGTACTTGTA CCTTGGGTTA CTATCAAAGA TGAACCGTTA GTTTATTTTG TAGAAAATGA 9780

ATATGAAAAA CGTGAACAAA TACAAAATAT TATTGATAGA GTACGTGAGG GCGAGAATAG 9840

CTAATTGAAA TATATTGGCG AAgCAATGTA ACGAATCTAA GAAAGCCCTA GAAAATACCT 9900 

CCATAATTGA TTGTCATATA AAACAAAAAC GGTAATTTCT ATTTATTGAG ATAGAAATTA 9960

CCGTTTATTT CGTGGACCTA TTGCATTGTT TTTATCATGC ATAATCATCA TTGTCGTTGT 10020 

TTGAGTCAAT TTTAATTTTC AGAATCAGAA GGCTGTTCTG GAATTGGGAA ATATTTGAAA 10080 

ATTTCACCGC TTTCAATCGC TTCGGTTAAC TGTTCTAACC ATTCGTAATA AACATGTGTA 10140 

TGATCAAGCT GAGCTTTAAT TTTTTGTGCC TCTTGTGTTT CAGCTTCAGT TAAATCACTG 10200 

CTTTCAAGTA ATGGATTGAT AATAGCTTGA GCATCTTTTA CTGCTTCGAC ATTGATGTCA 10260 

ATTTCACGCT GGAATTTTTT AGTGAAAAAG TTTCGGAAAA AGATGAAAAA GTCTTTCTCG 10320 

GCGATAAAAT GTTGTTTGCG GCTTCCTCTC GTAAATTGTT GTTTAACAAT ATCAAATTCC 10380 

TGCAATTTCT TAACGCCAGC ACTCATACTT GGTTTGCTCA TTTGCAATTG ATGACGCATT 10440 

TCATCAAGCG TCATACTGCC TTCAAACACC ATTGTGCCAT ATAAGTTTCC TACACTTCTA 10500 

TTAGTGCCAT ACAAATCCAT TGTCTGTCCA ATTGAATTAA TTACAATATC TTTTGCTTGT 10560 

TCTAATTGTT GCTGTTTGTT CTGAGAACGA GTCATCATTG CACCTCCGTA CATCATTTTG 10620

GTCACGTTAA AATAAATACT AATACATTAT AAAACCTTTT CTAAAAAAAG ACATTAAAAA 10680 

TATTTAAAGC ATTAAAGTTA AATGTTTCGT TAAATAAAAA TCTAACGAAC TTACAAAACT 10740 

TAATTCTTGA GTTGTTTTGT AAATTGACAC ATTTTTCATT TCTATGCTAA CATAAGTnTG 10800

TAAAATTcGT TAAATAAAAA TTTAACAAAC TTAACGGrGG TTGTTGAAkG GrACTTTTAA 10860 

aACATTTATC TCAGCGTCAA TATATTGATG GTGAGTGGGT TGAAAGCGCG AATAAAAATA 10920 

CAAGAGATAT TATCAATCCT TACAATCAAG AAGTGATATT TACGGTTTCT GAAGGGACAA 10980 

AAGAGGATGC AGAACGTGCA ATCTTAGCTG CAAGACGTGC GTTTGAGTCT GGTGAATGGT 11040
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AACATCgCGA AgCgTTAGCA CGATTAGAAA CATTAGATAC TGGAAAAACG TTAGAAGAAT 11160

CATATGCAGA TATGGATGAT ATTCATAATG TGTTTATGTA TTTTGCTGGA TTAGCAGATA 11220

AAGACGGTGG CGAAATGATT GATTCACCAA TTCCAGATAC AGAAAGCAAA ATTGTTAAAG 11280

AACCAGTAGG TGTAGTTACA CAAATTACAC CTTGGAATTA TCCGTTATTA CAAGCATCAT 11340

GGAAAATTGC GCCAGCGCTT GCTACGGGTT GTTCACTAGT TATGAAACCA AGTGAAATTA 11400

CACCATTAAC AACAATACGT GTTTTTGAAT TAATGGAAGA AGTTGGTTTC CCTAAAGGAA 11460

CAATTAATCT TATTCTAGGT GCAGGTTCTG AAGTTGGTGA CGTAATGTCA GGTCATAAAG 11520

AGGTTGACCT TGTATCATTT ACAGGTGGCA TTGAGACTGG TAAGCATATT ATGAAAAATG 11580

CTGCTAATAA TGTTACGAAT ATTGCCTTGG AACTTGGCGG TAAAAATCCA AACATTATCT 11640

TTGATGATGC TGATTTTGAA TTGGCAGTAG ACCAAGCGTT AAATGGTGGA TATTTCCATG 11700

CAGGTCAAGT TTGTTCAGCA GGATCAAGAA TATTAGTACA AAACAGTATT AAAGACAAAT 11760

TTGAGCAAGC ACTTATTGAT CGCGTGAAAA AAATCAAATT AGGTAATGGT TTTGATGCTG 11820

ATACTGAAAT GGGACCAGTG ATTTCAACAG AACATCGTAA TAAGATCGAA TCTTATATGG 11880

ATGTAGcTAA AGCAGAAGGC GCAACAATTG CTGTTGGTGG TAAACGTCCA GATAGAGATG 11940

ATTTAAAAGA TGGTCTATTC TTCGAGCCAA CAGTCATTAC AAATTGTGAT ACGTCAATGC 12000

GTATTGTACA AGAAGAGGTT TTCGGACCTG TCGTTACTGT AGAAGGCTTT GAAACTGAAC 12060

AAGAAGCGAT TCAATTAGCG AATGATTCTA TATATGGTTT AGCAGGTGCT GTATTTTCTA 12120

AAGATATTGG AAAAGCACAA CGCGTTGCTA ACAAGTTGAA ACTTGGAACG GTGTGGATTA 12180

ATGATTTCCA TCCATATTTT GCACAAGCGC CATGGGGTGG ATACAAACAA TCAGGTATCG 12240

GTAGAGAATT AGGCAAAGAA GGCTTAGAAG AGTACCTTGT TTCAAAACAC ATTTTAACAA 12300

ATACAAATCC ACAATTAGTG AATTGGTTTA GCAAATAAAA ATTAGATAAG GTGAGTGCCA 12360

TTGTAAGAAC ACAAGACACT CACTTTGTTT TGTATAAGTG GCGAAATGTT GATTGATAAT 12420

TTGGACTAAA CGCAAAATGA ATCATAGATT ATTTCATTAC TGTTAGTAAC AATCGTAAAA 12480

GGAAAAGCGA GTGTTTTGGT TAGCTAAGTT TAGCAATTCA ACGATAACCA ATCAGCCACT 12540

AACAAATATT TCATGCAATA CTCACTTTGA AATACAACAA ACTTTGGAGG TCATAACGAT 12600

GAGTAACAAA AACAAATCAT ATGATTATGT CATCATTGGA GGAGGCAGTG CAGGTTCTGT 12660

ACTAGGTAAT CGTCTGAGTG AAGATAAAGA TAAAGAAGTC TTAGTATTAG AAGCGGGTCG 12720

CAGTGATTAT TTTTGGGATT TATTTATCCA AATGCCTGCT GCGTTAATGT TCCCTTCAGG 12780

CAATAAATTT TACGATTGGA TTTATTCAAC AGATGAAGAA CCACATATGG GCGGTCGTAA 12840

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TCAACGTGGT AATCCAATGG ACTATGAAGG CTGGGCAGAA CCAGAAGGTA TGGAAACTTG 12960

GGATTTTGCG CACTGTTTAC CGTATTTTAA AAAATTAGAA AAAACATACG GTGCAGCGCC 13020 

TTATGATAAA TTTAGAGGCC ATGATGGACC AATTAAGTTA AAACGAGGGC CAGCAACGAA 13080 

TCCTTTATTC CAGTCATTCT TTGATGCAGG TGTTGAAGCA GGCTATCATA AAACACCTGA 13140

TGTGAATGGA TTTAGACAAG AAGGTTTTGG ACCGTTCGAT AGTCAAGTAC ATCGTGGTCG 13200 

CCGAATGTCA GCTTCAAGAG CATATTTACA TCCAGCGATG AAGCGTAAAA ACTTAACCGT 13260 

TGAAACACGT GCCTTTGTAA CTGAAATTCA TTATGAAGGT AGAAGAGCAA CTGGTGTTAC 13320 

GTATAAGAAA AATGGCAAAC TACATACCAT CGATGCTAAT GAAGTCATTT TGTCTGGTGG 13380

GGCATTCAAT ACGCCACAAT TACTACAATT ATCTGGTATC GGTGATTCAG AGTTCCTAAA 13440 

ATCAAAAGGC ATTGAGCCAC GTGTTCATTT ACCTGGTGTG GGTGAAAACT TTGAAGATCA 13500 

CTTAGAGG 13508

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7646 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 



GTAAGTATTG TCTTGATTTC CTAATAAAGT TATATCTTGT AATTCATCTT GTTGACGGCC 60

ATGTGCCATA TAAAGCGCTC CTTTAAATTT ATTTTTTTAT TATTTTGGCG TCTCGGCGTG 120

CTTTTTCAAA CATGTAATAA CTTGCACCGA TAATAACGAC GTAACCTAAT GTTGCATAGA 180

AATCfTGGAGA TTCTCCGAAT AGAATAAATC CAAGTATTGC TGTGAAAATT ATAGATGCAT 240

ACGTAAAAAT AGAAATATCT TTTGCTGCTG CAAAACTATA TGCTAAAGTA ACACCAATTT 300

GACCCACAGC GGCAgCTAAG CCAGCCCCTA ATAGATAAAG TATTTGCATC TGACTCATTG 360

GTTCATAAGT ATATGCAGTG AAAGGTATTA AAACGATGAC AGAAAATAAG GAGAAGTAAA 420

ATACTATAGT ATATGGTGCT TyTCTTGTAC TAAGTGCTCG AACACATGTA TATGCTGATG 480

CTGCAAAAAT ACCTGAGAAT AAGCCAGCTA ATGATGGAAT CATAGATGAT GAAAATTCAG 540

GTTTCACTAT TAAnAGCAaC CTAAAATAGC AATTATCATT GCTGTAATTT GaTACTTCCT 600

TACCTTTTCA TGtAAGAaaA CAATGCTTaA TAAAATCGTC CAGAAAGGAT TGAGTTTCAT 660

TAATGAATCG GCATCACTAA GTACCATATG ATCAATGGCA TAAATATTTA ACAATACACC 720

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TGGCTGATGG TATTTATATA TAAAAAATAA TGGAATAAAC ATTGCTACTA AGTTTCGTGC 840

TAATGATTTT TGAAAAACAG GAAGGTCACC TGCAAGTCTG AAAAAGACTG ACATAAAACT 900

GAAACCAATA GCCGAAATTA AAATGGCAAT GATACCTTTT ACTTTAGGAT TCAATTTTAT 960

CGCCTCTTTT ATATAAAATT AACGTATTTA TATTAGCATA AAACAACATG TTGTGCATAA 1020

ATAGTTGAAA TTTACTATAA AAAGACTATA ATAGACTGTA GCGAACAAAC GTTCTGTGTT 1080

TATTTGTCGG AATAATAGGG CATTACACTT TTATGAATGT TTGTGTTATT ACATAAAACA 1140

AATATCAATT CAGTATCAAG CTAATAAGCT TTTTCTTGAT TTCTGTTGAT ACAATTGAGA 1200

TTGACACAGA TTTAAAAAAA TCAAGTGATA TCTACTAAAA AATTTTTTTA AATTTGTTCA 1260

AGTTTTTCTA ATTTAGTATT GGTGCCTAGT TGGAACGTTT TACGAACATT CGATTAGAAA 1320

ATGGCACTTT AAATCATAGT GTGTCTTATG TATAATGAAA CACATAATAT AGTGTTGGTG 1380

AAACGAAAAA gACACAATAT CTTGTGTTTT GTATGCAAAT GCTTTATTTA TGAAGAAATT 1440

ACATTTAAAA GTAATTTAAC ACAGAAATTT AATAGTTATT ATCAATTAAT AGTCATATTT 1500

TTAGAAAATG TACTGAGCAA ATGGAAGATA TCCAATGATG TAAACACTAC ATATAGTGAT 1560

TTTTATACAT TCAACCCATA TAAGCTACTA TTTTCTCAAA TATAAATCTA TGCAATTGGT 1620

TTACATTTGA GAAAATAAGT AGCTTCATTA TAGTTAATAC AATGCTGAGA TAACCATAGT 1680

AACCATGTTG TTAAAGCATT TTTTAATTGG AATGACTACT TTATTTAAAA GGGTTGAAGA 1740

AAGAAGGTGA TCCAATGAAA ATAATATATT TTTCATTTAC TGGAAATGTC CGTCGTTTTA 1800

TTAAGAGAAC AGAACTTGAA AATACGCTTG AGATTACAGC AGAAAATTGT ATGGAACCAG 1860

TTCATGAACC GTTTATTATC GTTACTGGCA CTATTGGATT TGGAGAAGTA CCAGAACCCG 1920

TTCAATCTTT TTTAGAAGTT AATCATCAAT ACATCAGAGG TGTGGCAGCT AGCGGTAATC 1980

GAAATTGGGG ACTAAATTTC GCAAAAGCGG GTCGCACGAT ATCAGAAGAG TATAATGTCC 2040

CTTTATTAAT GAAGTTTGAG TTACATGGAA AAAACAAAGA CGTTATTGAA TTTAAGAACA 2100

AGGTGGGTAA TTTTAATGAA AACCATGGAA GAGAAAAAGT ACAATCATAT TGAATTAAAT 2160

AATGAGGTCA CTAAACGAaG AGAAGATGGA TTCTTTAGTT TAGAAAAAGA CCAAGAAGCT 2220

TTAGTAGCTT ATTTAGAAGA AGTAAAAGAC AAAACAATCT TCTTCGACAC TGAAATCGAG 2280

CGTTTACGTT ATTTAGTAGA CAACGATTTT TATTTCAATG TGTTTGATAT TTATAGTGAA 2340

GCGGATCTAA TTGAAATCAC TGATTATGCA AAATCAATCC CGTTTAATTT TGCAAGTTAT 2400

ATGTCAGCTA GTAAATTTTT CAAAGATTAC GCTTTGAAAA CAAATGATAA AAGTCAATAC 2460

TTAGAAGACT ATAATCAACA CGTTGCCATT GTTGCTTTAT ACCTAGCAAA TGGTAATAAA 2520

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5 



ACATTTTTAA ACGCAGGCCG TGCGCGTCGT GGTGAGCTAG TGTCATGTTT CTTATTAGAA 2640

GTGGATGACA GCTTAAATTC AATTAACTTT ATTGATTCAA CTGCAAAACA ATTAAGTAAA 2700

ATTGGGGGCG GCGTTGCAAT TAACTTATCT AAATTGCGTG CACGTGGTGA AGCAATTAAA 2760

GGAATTAAAG GCGTAgCGAA AGGCGTTTTA CCTATTGCTA AGTCACTTGA AGGTGGCTTT 2820

AGCTATGCAG ATCAACTTGG TCAACGCCCT GGTGCTGGTG CTGTGTACTT AAATATCTTC 2880

CATTATGATG TAGAAGAATT TTTAGATACT AAAAAAGTAA ATGCGGATGA AGATTTACGT 2940

TTATCTACAA TATCAACTGG TTTAATTGTT CCATCTAAAT TCTTCGATTT AGCTAAAGAA 3000

GGTAAGGACT TTTATATGTT TGCACCTCAT ACAGTTAAAG AAGAATATGG TGTGACATTA 3060

GACGATATCG ATTTAGAAAA ATATTATGAT GACATGGTTG CAAACCCAAA TGTTGAGAAA 3120

AAGAAAAAGA ATGCGCGTGA AATGTTGAAT TTAATTGCGC AAACACAATT ACAATCAGGT 3180

TATCCATATT TAATGTTTAA AGATAATGCT AACAGAGTGC ATCCGAATTC AAACATTGGA 3240

CAAATTAAAA TGAGTAACTT ATGTACGGAA ATTTTCCAAC TACAAGAAAC TTCAATTATT 3300

AATGACTATG GTATTGAAGA CGAAATTAAA CGTGATATTT CTTGTAACTT GGGCTCATTA 3360

AATATTGTTA ATGTAATGGA AAGCGGAAAA TTCAGAGATT CAGTTCACTC TGGTATGGAC 3420

GCATTAACTG TTGTGAGTGA TGTAGCAAAT ATTCAAAATG CACCAGGAGT TAGAAAAGCT 3480

AACAGTGAAT TACATTCAGT TGGTCTTGGT GTGATGAATT TACACGGTTA CCTAGCAAAA 3540

AATAAAATTG GTTATGAGTC AGAAGAAGCA AAAGATTTTG CAAATATCTT CTTTATGATG 3600

ATGAATTTCT ACTCAATCGA ACGTTCAATG GAAATCGCTA AAGAGCGTGG TATCAAATAT 3660

CAAGACTTTG AAAAGTCTGA TTATGCTAAT GGCAAATATT TCGAGTTCTA TACAACTCAA 3720

GAATTTGAAC CTCAATTCGA AAAAGTACGT GAATTATTCG ATGGTATGGC TATTCCTACT 3780

TCTGAGGATT GGAAGAAACT ACAACAAGAT GTTGAACAAT ATGGTTTATA TCATGCATAT 3840

AGATTAGCAA TTGCTCCAAC ACAAAGTATT TCTTATGTTC AAAATGCAAC AAGTTCTGTA 3900

ATGCCAATCG TTGACCAAAT TGAACGTCGT ACTTATGGTA ATGCGGAAAC ATTTTACCCT 3960

ATGCCATTCT TATCACCACA AACAATGTGG TACTACAAAT CAGCATTCAA TACTGATCAG 4020

ATGAAATTAA TCGATTTAAT TGCGACAATT CAAACGCATA TTGACCAAGG TATCTCAACG 4080

ATCCTTTATG TTAATTCTGA AATTTCTACA CGTGAGTTAG CAAGATTATA TGTATATGCG 4140

CACTATAAAG GATTAAAATC ACTTTACTAT ACTAGAAATA AATTATTAAG TGTAGAAGAA 4200


TGTACAAGTT 


GTTCTATCTA 


ACAATTAAAT 


GTTGAAAATG 


ACAAACAGCT 


AATCATCTGG 


4260 


TCTGAATTAG 


CAGATGATTA 


GACTGCTATG 


TCTGTATTTG 


TCAATTATTG 


AGTAACATTA 


4320 
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ATGTTTTGGA 


GACAAAATAT 


ATCTCAAATG 


TGGGTTGAAA CAGAATTTAA 


AGTATCAAAA 


4440 




GACATTGCAA GTTGGAAGAC TTTATCTGAA GCTGAACAAG ACACATTTAA AAAAGCATTA 4500
GCTGGTTTAA CAGGCTTAGA TACACATCAA GCAGATGATG GCATGCCTTT AGTTATGCTA 4560
CATACGACTG ACTTAAGGAA AAAAGCAGTT TATTCATTTA TGGCGATGAT GGAGCAAATA 4620
CACGCGAAAA GCTATTCACA TATTTTCACA ACACTATTAC CATCTAGTGA AaCAAACTAC 4680
CTATTAGATG AATGGGTTTT AGAGGAACCC CATTTAAAAT ATAAATCTGA TAAAATTGTT 4740
GCTAATTATC ACAAACTTTG GGGTAAAGAA GCTTCGATAT ACGACCAATA TATGGCCAGA 4800
GTTACGAGTG TATTTTTAGA AACATTCTTA TTCTTCTCAG GTTTCTATTA TCCACTATAT 4860
CTTGCTGGTC AAGGGAAAAT GACGACATCA GGTGAAATCA TTCGTAAAAT TCTTTTAGAT 4920
GAATCTATTC ATGGTGTATT TACCGGTTTA GATGCACAGC ATTTACGAAA TGAACTATCT 4980
GAAAGTGAGA AACAAAAAGC AGATCAAGAA ATGTATAAAT TGCTAAATGA CTTGTATTTA 5040
AATGAAGAGT CATACACAAA AATGTTATAC GATGATCTTG GAATCACTGA AGATGTGCTA 5100
AACTATGTTA AATATAATGG AAACAAAGCA CTTTCAAACT TAGGCTTTGa ACCTTATTTT 5160
GAGGAACGTG AATTTAACCC AATCATTGAG AATGCCTTAG ATACAACAAC TAAAAACCAT 5220
GACTTCTTCT CAGTAAAAGG TGATGGTTAT GTATTAGCAT TAAACGTAGA AGCATTACAA 5280
GATGATGACT TTGTATTTGA CAACAAATAA CAATTAAATT AAAAGACCTT CACATGTAAA 5340
GGGAAATAGC GATTCGTTTC GTCTTGTCTC CTACATGTTG AAGGTCTTTT TTTATGTGTA 5400
TCTAACTCAT TATGAGTCTG AGTAAGAAAT CAATGCTCTA AGATGTACAA TGCTATTTAT 5460
ATTGGCAGTA GTTGGCGGGG CCCCAACACA GAAGCAGGCG GAAAGTCAGC TAACAATATT 5520
GTGCAAGTTG GCGGGGCCCC AACATAGAAG CAGGCGGAAA GTCAGCTAAC AATAATGTGC 5580
AAGTTGGCGG GGCCCCAACA TAAAAGCAGG CGGAAAGTCA GCTAACAATA TTGTGCAAGT 5640
TCGGgCGGGG CCCCAACATA AAGAAAAACT TTTTCCTTTA GAAATTATCA CTTCCaCaTG 5700
AGTTTTACTC ATGTATTCCT ATTTTTAAGT ACACATTAGC TGAGGCTAAT GTTAAGAACC 5760
ACTACTTAAT CAATCATTAG TAGTTTTTAT CATTTCCACT ATTCCCaGAC ATCaAAATCT 5820
TAAGTGTTCT ATTTTACTTT AAGTAAACAA AATACACATT CCGAAAAATT AAATTTCAGT 5880
TTAATTGCAA ATATCAATAA AATTGACACT AAATTATTTG AAAGGCTATT GAAATTATGG 5940
TCAAAAAACG OTACTATTAA TGAGAAATAT TATCAATGAT AATGATTATC ATTAATTTAA 6000
AGGGAGAAAA ATTTGTAATG AAGTATTTAT TAAAGGGAAA TATTTTGCTT CTATTACTAA 6060
TATTGTTGAC AATTATTTCG TTGTTCATAG GTGTGAGTGA ACTATCAATT AAAGATTTAC 6120
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5 



GTATTTTAAT TGCTGGAAGT TCGTTGGCTT TAGCAGGCTT GATAATGCAA CAAATGATGC 6240
AAAATAAGTT TGTTAGTCCG ACTACAGCTG GAACGATGGA ATGGGCTAAA CTAGGTATTT 6300
TAATTGCTTT ATTGTTCTTT CCAACCGGTC ATATTTTATT AAAACTAGTA TTTGCTGTTA 6360
TTTGCAGTAT TTGCGGTACG TTTTTATTTG TTAAAATCAT TGATTTTATA AAAGTGAAAG 6420
ATGTCATTTT TGTACCGCTT TTAGGAATTA TGATGGGTGG GATTGTTGCA AGTTcACAAC 6480
CTTCATCTCA TTGCGCACGA ATGCTGTTCA AAGCATTGGT AACTGGCTTA ACGGGAACTT 6540
TGCCATTATC ACAAGTGGAC GCTATGAAAT TTTATATTTA AGTATTCCTC TTTTAGCATT 6600
GACATATCTT TTTGCTAATC ATTTCACGAT TGTAGGAATG GGTAAAGACT TTACTAATAA 6660
TTTAGGTTTG AGTTACGAAA AATTAATTAA CATCGCATTG TTTATTACTG CAACTATTAC 6720
AGCATTGGTA GTGGTGACTG TTGGAACATT ACCGTTCTTA GGACTAGTAA TACCAAATAT 6780
TATTTCAATT TATCGAGGTG ATCATTTGAA AAATGCTATC CCTCATACGA TGATGTTAGG 6840
TGCCATCTTT GTATTATTTT CTGATATAGT TGGCAGAATT GTTGTTTATC CATATGAAAT 6900
AAATATTGGT TTAACAATAG GTGTATTTGG AACAATCATT TTCCTTATCT TGCTTATGAA 6960
AGGTAGGAAA AATTATGCGC aACAATAATA AAAAAATAAT GCTTTTAATT GCAGTAACGT 7020
TATTAATTAG TATGCTGTAC TTATTTGTAG GTATTGATTT TGAAATATTT GAATATCAAT 7080
TTTCAAGTCG TTTAAGAAAG TTCATATTAA TTATTTTAGT AGGTGCTGCC ATTGCAACTT 7140
CAGTGGTGAT TTTTCAAGCG ATTACAAATA ACCGTCTATT GACACCATCA ATAATGGGGT 7200
TAGATGCAGT TTATTTATTT ATCAAAGTAT TGCCAGTCTT TTTATTTGGA ATTCAATCGG 7260
TATGGGTTAC TAATGTATAT TTGAACTTTA TATTAACACT TATAACGATG GTGTTATTCG 7320
CACTAATCCT ATTCCAAGGT ATCTTTAAAA TCGGACATTT TTCAATTTAT TTTATCTTAC 7380
TTA3TGGTGT CCTTTTAGGA ACATTTTTTA GAAGCATAAC AGGTTTTATT CAACTGATTA 7440
TGGATCCTGA GTCATTTTTA GCAATACAAA GTAGTATGTT TGCTAATTTT AATGCTTCTA 7500
ATTCGAATTT AGTTACTTTC TCAGCAGTGC TATTAGTAAT CTTATTAGTC ATTACAATTT 7560
TACTATTGCC TTATTTAGAT GTATTGCTTT TAGGTCGTGC TGAAGCAATT AATCTTGGGA 7620
TATCGTATGA AAAATTAACG CGAATT 7646


(2) INFORMATION FOR SEQ ID NO: 122: 









(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1194 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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10 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

ATGAATATAT TTnnAAATAA ATTATTATGG ATTGCACCAA TnGCCACTAT GATTATCTTG 60 

GTAATCTTTT CTTTAGCTTT TTATCCTGCA TATAATCCTA AACCAAAAGA TTTACCAATT 120 

GGTATATTAA ACGAGGATAA AGGTACAACG ATTCAAGATA AAAATGTTAA CATTGGTAAA 180 

AAATTAGAGG ATAAATTATT AGATAGTGAT TCTAATAAAA TTAAATGGGT TAAGGTTGAT 240

AGTGAAAAAG ACCTTGAAAA AGATTTGAAA GATCAAAAAA TCTTTGGAGT AGCTATTATT 300 

GATAAAGACT TTTCAAAAGA TGCTATGAGT AAAACACAAA AAGTAGTTAT GGATAGTAAA 360 

AAAGAAGAAA TGCAACAAAA AGTTGCTTCA GGTGAAATTC CGCCACAAGT GGTTCAACAA 420 

ATGAAACAAA AAATGGGGAA TCAACAAGTA GAGGTTAAGC AGGCTAAATT TAAAACGATT 480

GTAAGTGAAG GATCAAGCTT ACAAGGTTCA CAAATTGCAT CAGCTGTGTT AACTGGTATG 540 

GGTGATAATA TTAATGCTCA AATTACGAAG CAAAGTTTGG AAACATTAAC GAGTCAAAAT 600

GTTAAAGTCA ATGCCGCGGA CATCAATGGT TTGACGAATC CAGTAAAAGT GGATAATGAA 660 

AAACTTAATA AAGTTAAAGA TCACCAAGCA GGTGGTAATG CACCATTCCT AATGTTTATG 720 

CCAATTTGGA TAGGTTCAAT CGTAACGTCT ATCTTATTGT TCTTTGCATT TAGAACTAGT 780 

AACAATATCG TCGTGCAACA TCGTATCaTT GCtTCAATTG GACAGATGAT ATTTGCAGTT 840

GTTGCAGCAT TTGCAGGTAG CTTTGTTTAT ATTTATTTCA TGCAAGGCGT TCAAAGATTT 900 

GATTTTGACC ATCCAAATCG TATCGCAATT TTTGTAGCAT TTGCGATTCT TGGTTTCGTG 960 

GGCCTTATTT TAGGTGTTAT GGTATGGCTA GGTATGAAGT CAGTTCCAAT TTTCTTCATT 1020 

TTAATGTTCT TTAGTATGCA ACTTGTAACG TTACCTAAAC AAATGTTGCC TGAAAGTTAT 1080 

CAAAAATATG TATATGATTG GAATCCATTC ACACACTATG CAACAAGTGT AAGAGAcTAT 1140

TATAdTGAA TCATCATATT GAATTAAATA GTACAATGTG GATGTTTATA GGGT 1194 
(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 558 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

GACCGACCTA TACATCCGTA TAAGTATTTC TTGATATAAG TCTTCTAAAT CATAATGATT 60 
AAATCCAAAT GTTTTGATGC GTCGAATAAT TAATGGTTGT AGATCCATTA CTAACTTTTC 120 
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GTATTTCAAA TATTAAACTA ACCCCTTCTA TCTAAAATTT AAGGTTAGTT TAATATTGTT 240 
ACATTCAAAA TTTCAAGATG ACGGAAATGT CATTTCTTAT GATGTCCTCT TCGTATTTTT 300 
TCAAATTCTG CAAGGATTTC AGAAGATAAC GGAATTCGAG TTCTTGGCTT GTTTTCACTT 360

ATATCATCTA ATGATTTACT CACATCAATT TCATTTTCTT TTAAATCTCT CCACATTTCG 420 
CGAGATGATA TTCTATATGC ACCTGATCCA AAGATAGCAT GTTGcTCACT CaTATCACTT 480 


GTTACAACTG TAATATGcTT AGtATGCTTG tCaTAAAGtT CaTAAACCAT AACGGTTCTA 540 
ATGGAAACCA ATCAGCTG 558 
(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7762 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 



GCTTCAGACA TnTGATGATA TAATCTCTCA TCATCGATTA ATTCTTTTGC AGCTTGATAC 60
ACATnTTGCT TATTTGTTCC AATGACTTTT AATGTGCCAG CTTCAACACC TTCAGGACGT 120
TCTGTAACAC TTCGCCAAAA CTAAAACTGG CTTATTAAAT GATGGCGCTT CTTCCTGAAT 180
TCCACCTGAA TCTGTCAAAA TAAAATAAGA TTTTnTAGCA AAATTATGGA AATCTATACG 240
TCCAAAGGTT CAATCAATTC AATTCTGTCA TGACTACCTA AAATCTTTTG AGCCACCTCT 300
CGAACTTTCG GGTTTTTATG CATTGGATAT ACCAGTGCTA AATCAGTATA CTCATCTATT 360
AAGCGTCTAA CCGCTTTAAA TATATTTTCC aTGGGTTTCC CGATATTTTC TCGTCGGTGT 420
GCTGTCATrA GAATGAATTT kTtGTCATGG TATTTATCCA TGATGTTAGA TTTATAATTG 480
TCATCAACTG TATATTTCAT AGCATCAATC GCAGTATTAC CAGTGACAAC AACACTTTCT 540
GAATATTTCC CTTCACTTAA CAAATGCGAT GCAGCATTTT TAGTAGGTGC AAAATGTAAG 600
TCAGCTAATA CACCAACTAA TTGTCTATTC ACCTCTTCTG GAAAAGGTGA ATATTTATCA 660
TAACTTCTAA GCCCTGCTTC AACGTGTCCA ATCGGCACTT GGTTATAAAA TGCCGCTAAA 720
CCACCTGCAA ATGTCGTCAT CGTATCACCA TGTACAAGTA CCATGTCTGG TTTTTCTAAT 780
TGAATCACTT GTTCTAATTG AGTGATTGAT TTAGAAGTTA TCTCAGAAAG TGTCTGTCCT 840
GATTTCATAA TATTCAAATC GTATTTTGGT TTGATTTCAA AGGTACTTAA TACTGAATCA 900
AGCATTTCTC TATGCTGTGC TGTAACAACA ACAATTGGCT CGAGCATTTT TTCTTGTTCC 960
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ATCTTTTTCA TCAAACTACT TATCTCCGAT TCTTCTATTT AGTACCAAAC AATCTATCTC 1080 

CAGCGTCGCC TAACCCTGGT GTGATATATG CTTTGTCATT aGCTTTTCAT CAAGTGCAGC 1140

AATATAAATA TCTACATCTG GATGTGCTTC ATGCATCTTT TCTACGCCTT CTGGTGCTGC 1200

AATTAAACAC ATGAAGCGAA TATTTTTAGC GCCACGTTTC TTCAATGAAG TAATAGCTTC 1260

AATTGCTGAT GCGCCTGTTG CTAACATAGG ATCAACAACA ATGATTTGTC TTTCAGTAAT 1320 


ATCTTGAGGT AACTTAGCAA AATACTCTAC AGCCTTTAAT GTTTCGGGAT CTCGATATAA 1380 

ACCGATATGT CCAACTCTGG CTGCAGGTAC TAAACTTAAA ATACCATCAG TCATACCTAA 1440 

ACCAGCTCTT AAAATTGGAA CGATAGCTAA TTTTTTACCA GCTAATCGTT TAGCCGTCAT 1500 


TTTAGTTACA GGCGTTTCAA TATCAACATC CTGAAGCTCT AAGTCTCTAG TTACTTCATA 1560 

TGCCATCAAC ATACCAACTT CGTCTACAAG TTCTCTAAAT TCTTTAGTAC CTGTATTTAC 1620 

ATCTCTAATA TAGCTTAGTT TGTGTTGAAT TAATGGATGA TCGAAAACGT GTACTTTACT 1680

CATAAAAATT ACTCCTATCT TTGTGTATGT TTATTGATAT AGAGGATATT CAGCTGTTAA 1740 

TTTCGCAACG CGTTCTTTAG CTTGTTGTAA TTTTTCTTCA TCTTTACTAT TTTTCAATGC 1800 

TAAACTGATG ATTTTTGCAA CTTCCTCAAA AGCTTTTTCA TCAAATCCAC GCGTTGTTGC 1860

AGCAGGTGTA CCTAAACGTA TACCACTCGT TACAAAAGGT TTTTCTTGAT CGAACGGAAT 1920 

GGTATTTTTG TTACATGTGA TACCAACTGA ATCTAAAGTC TCTTCAGCTT CTTTACCAGT 1980 

AAGTCCTATA GACCCTTTTA CATCAACAGC TACTAAGTGA TTATCTGTAC CGCCAGAAAC 2040

AATTCTAAAT CCTTCATTAA TTAATGCTTC TGCAAGAACT TTTGCGTTTT TAACCACTTG 2100 

TTGTTGATAC GTTTTGAAAT TATTTTCTAA CGCTTCTCCA AAAGCAACTG CTTTtGCTgC 2160 


AATAACATGC TCAAGAGGTC CACCTTGAAT ACCAGGGAAA ATTGTTTTAT CTATGTCTTT 2220 

TTTATATTCT TCCTTACATA AAATCATACC ACCACGtGGT CCGcGTAATG TTTTGTGTGT 2280 

TGTAGTTGTT ACAAAATCAG CATATTCTAC TGGATTTGGA TGTAAACCTG CCGCTACTAA 2340 


TCCTGCAATA TGTGCCATGT CTACCATTAA CTTAGCGTTT ACTTCATCTG CGATTTCTTT 2400

AAACTTTTTG AAGTCAATTG TTCTTGAATA TGCTGATGCT CCTGCCACAA TAAGCTTAGG 2460

CTTATGCTCT AACGCTAATT TACGAACTTC ATCATAATTG ATTCGTTCTG TGTCTTTATC 2520

TACTCCATAT TCAACGAAAT TGTAGAATTT ACCACTAAAA TTAACAGGCG CTCCATGTGT 2580 

CAAGTGACCA CCATGACTCA AATTCATACC TAAAACTGTG TCGCCCATTT CTAATGCAAC 2640

TAAGTAAACA GCCATGTTCG CTTGTGAACC TGAATGTGGT TGAACATTGA CATGTTCAGC 2700

TCCAAACAAT GCTTTAGCAC GATCAATTGC GATGCTTTCA GTAACATCTA CAAACTCACA 2760 

EP0 786 519 A2 

TTGTGCTTCC ATAACCGCTT CCGATACAAA ATTTTCCGAT GCGATTAACT CTATGTTGCT 2880 

ATTTTGTCTC TGAAATTCTC TCTCGATTGC TTCTGCGATA ACTTTATCTT GCTTGGTGAT 2940 

ATAAGACATA AAATCTCCCC TTCTTTCAAA AAAACTTATT GGTATTTAGC ACGTTCGCCA 3000 

CCAATCTTTT TCGGCCTAGA TGTGGCAATA GTTACAATTG CCTGTCCTAC TTGCTTTACT 3060 

GAGGTCCTTA CAGGTACACA TACATGTTTA ATATGCATGC CTATTAACGT TTGACCAATA 3120 

TCAATTCCAC AAGGAACAGT AATATGTTCG ACCACGATCG GATCCTTCAT ATGCTGAAAA 3180 

GCGTATGTTG CCAAACTCCC TCCAGCATGT ACATCTGGAA CGACGGAAAC TTCTTCCATT 3240 

GTTAATGGAT TATACTGAGA TTTTTCTATT GTTATCGCTC TGTTGATATG TTCACATCCT 3300 

TGAAAAGCAA AAGTAACGCC TGTCTCTTTA CTCACAACAT CTAATGCATT AAAAATAGTT 3360 

TCTGCAACTT CCaTCGAACC GACAGTCCCT ATTTTTTCGC CAATGACTTC CGATGTTGAA 3420 

CATCCAATTA AACATATATC TCCTTTATTA AAAAAGGACA TATCTTTTAA TTCGTCTAAT 3480

AACATTGTCA AATCTTTCAT AAAAGCCCAC CCTTCCTAAA AATAAAAAAG GAATATAGCA 3540 

AAGTGCTACA CTCCTCTATT ATAACTTATT TAACTGTTAA CATATACTAA TTATACAGAA 3600 

TTCCTACTAG CAAATAATAT CTTTTAATTT TAAAATTAAA CTTACAAGTT CTTCATAGGT 3660

ATGTACATAC ATTTCTTTTG TTCCACCGTA TGGATCTATA ACTTCTCCTG CTTCTTTtAC 3720 

ATATTCATGC AATGTGAAAA CATGATTTTG CAAACCAAAG TGTGCCTCTA TTAATTCTTT 3780 

GTGCGAATAC GACATCGTCA AAATAATATC TGCTTTCAAA TCTGCTTCAG TAAATTGTTG 3840

CGATAAGGTC GTTTCAGCTA AATGATGTTC TTCAACTAAG TCTTCAACAT AATTCGAAAC 3900 

ACCTTGATTG TTCACAGCGA ATATACCTCT TGATTCAAAT TGATGATTTG GCATAACCTC 3960 

TTTTGCAATA CTTTCCGCTA ATGGGCTACG ACATGTGTTA CCTGTACAAA CGAATAAAAT 4020

CTTCATAGTT CACATCCTTT AATAATGTGA TTACCTGCAG CTTTTAACAT GCGATTCATA 4080 

ATTGCTTCTG TATTATCATT CAGCTCAAAG CCGTATATAT ACGCCGCTGA AATATTTTCA 4140

TTTTCATCAA GTGAATGTAA CACATCATAA AGATTATGAC TTGCTTGTTT AACATCATTG 4200 

TCATCCTGAC ATAATTGAAT GAATTGCGCT TCACTTGGTA TAAACGCCAC CTTATTACTC 4260 

GGCACAATAA AAGCTATAGA AGACCAATCT TTACCGTCAT TTCCAATTTT GCTCTCAATA 4320 

TCTGTAATAA TTGTAAGTGG TGTATTGGGT GAGTAATGCT TATACTTCAT ACCTGGTGCA 4380 

ATTGGCTGTT CAGTATCATT ATAATCAGCA TGGGCGATAC TATTCGGAAG TATTTCTGTA 4440 

ATCATTGCTG CTGTTATAGA ACCAGGTCTT GCAATTTTAT AAGGAAAAGA TGTGCAATCT 4500

AAAACCGTAC TTTCTAATCC TTCTTCACTT TGTTCAGCTT GAACAATACC ATCGATACGG 4560
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GCACTTGGAG 


CAGCTAGAGG 


TTCATTTATG 


ATTTGTAATA 


ATTGT CTACC 


TACAGAATGG 


1DOU 




CTTGGCATTC TAACAGCAAC TGATGATAAA CCTCCAGAAA CTTTTCGACA TAGATAGCCT 4740
AGCTTTAACG GCAATATAAA CGAAATAGGG CCCGGCCAGA ATGCCTGCAT TAACTTTTCT 4800
ACGCGTGGAT CCAAAGTATA TGTAAAATCT TTTAATTGAC CTTTACTGTG TATATGAACA 4860
ATAAGCGGAT TGTCAGATGG ACGGCCTTTA GCTTCATATA TTTTAGCTAC AGCTTCTTCA 4920
TCTGTCGCAT TTGCTGCAAG TCCATAAACT GTTTCAGTTG GTAAACCTAT TAAACCACCG 4980
TTTAAAACAA TGTCTTTTAT TTCATTAATT TTAGGATATT GCTGTAAATC TTGATTATAT 5040


15 


TCTCTAACAT 


CCCAAATTTT 


AGTATCCAAC 


TTAATCACGC 

X X *^«^ X Vp^wW 


PTTTPT* I * A*I " 1 1 

w X X 1 Vl X AX X 


TATPATAATA 
xnx un x /tja x f\ 




TAAAGCAAAA AGCTATGCAC 


TTAACTAATC 


ATAGCAAAGG 


CATAACTTCT 
w» x nnw x x w x 


A ATT A GG ATT 


5i fin 




TAAATGAGAC 


GATTCGATCG 


T GG CCATTT A 


TATPTTTAAT 
xnx v>l i J. nnx 


AATGTPGATT 


1 X x x 1 o x v«n v 




20 


GAAATTTATT 


TAAAATTATT 


G ATT TAAG TG 

x x x x w 




ATTGT AAP P A 


Alii Uvinnn 






CAACTGGGCT 


GCCTTTTTCC 


ATAACGTGAG 


GT A A ATPTTP 

XnnnX \rl 


AATGATTGAT 
nn X Vzn X X VJn x 


T P A T A & A T Af2 
i X Ann 1 nVJ 






CATATCCATG 


GTTATCTGCA 


AACAATGCCT 


GATGTGGTTC 


GAATCTGGTA 

wnn X V,* X *w\J X n 


APPGTTGGAG 
nW x 1 WAV 


OH U L/ 


25 


ACATCGTAAC 


CATATCTTTT 


TCATCTATAT 

X V*>*» X ^ X *» X ** X 


A TGGTGG A TT 


AG ATA TP A AG 
nun X n X vnnu 


PPP.TTP A A PT 






TGATACCTTC 


ATTAATTAAG 


GGCTTTAATG 


CATCCCCTGT 


TAAAAATTGT 


ATTTGTG A TT 
n xxx xj x urn x 


R^ft 




GATGCTTCTC 


AGCATTATTA 


CGAGCCATAT 


TCATTG PJTTC 


AAGTGAAATA 


TPAGTAP.PAA 
1 \— n w 1 nO v — nn 




30 


TAACATTTAA 


ATCCGGCTTT 


TCACATTTCA 


AAGTAATTGC 


AAGTACACCA 


PT A rPPfSTTP 


ft 




CGATATCTAC 


GATTGTTGCA 


TCATCTTCTA 


ACTGTTGTAA 


GAAATGCAAC 


ATT1PTTPTT 
n x x n w 1 1L1 i 


^"7n ft 

J / uu 




CAGTTTCAGG 


TCTTGGTATC 


AAACAATTTG 


AGTTTACATC 


AAACGTTCTA 


PPATAAAATG 
wn Innnn 1 w 


^"7 C ft 


AGGCAAAGCC AACTATATAC TGTATAGGCT CTCCTAATAA CATACGTTGT AATGCTAAGT 5820
CGAACTTCAT AATCATCGCT TTCGGCATAT CATCATGCAT GTGGACTACA AAGTCCGTAC 5880
GCGTCCATTG AAATACATCT AACATTAACC ATTCAGCTCG TGTTTGTTCA AACCCTTTTT 5940
GTTGTGTTAA ATGAATTGCT TCATCTAACT TTTCTTTATA ATTCACCATT ATTAAGTTCT 6000
TTCAATTTAT CTGTCTGCTC TGATAAAGTC AGTGCATCTA TAATTTCTTC TAAATGGCCT 6060
TCCATAATTT GCCCTAATTT TTGAAGCGTT AGACCTATAC GATGGTCTGT TACACGGCTT 6120
TGTGGATAAT TATAAGTTCG AATACGTTCT GAACGATCAC CAGTACCGAC TGCTGATTTA 6180
CGTTGTGACG CATACTTTTG TTGTTCTTCT TGAACTTTCA TATCGTATAA ACGTGCTTTT 6240
AACACTTTCA TTGCTTTTTC ACGGTTTTGA ATTTGAGACT TCTcAGAAGA TGTTGCAATG 6300
ACACCAGTTG GTAAATGGGT AATACGTACT GCAGAGTCAG TTGTGTTTAC GTGCTGACCA 6360
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ACATCTTCAA CTTCTGGTAA AACTGCCACT GTAGCTGTTG AAGTATGAAT ACGTCCACCT 6480 

GATTCTGTTT CAGGCACACG TTGAACGCGG TGCGCACCAT TTTCAAATTT CAATTTACTA 6540 

TACGCGCCAT TACCAGAAAC TGAGAAACTA ATTTCTTTGT AACCACCATG GTCACTTTCA 6600

GACGCTTCTA CTATTTCAGT TTTGAATCCT TGTGATTCAG CATACTTTGA ATACATACGC 6660 

ATTAAATCAC CAGCAAAAAT CGCAGCCTCA TCACCACCTG CTGCTGCTCT TATTTCTACA 6720 


ATAACGTCTT TGTCATCATT AGGATCTTTA GGAATCAATA ATATTTTAAG CTCTTCTTCA 6780 

AGATTTGGAA GTTCAGCTTT AATACCATTA CTCTCCTCTT TTAACATTTC TACTTCTTCT 6840 

TTATCATCAG TCTCACTTAA CATTTCTTCA ATATCAGCTA ATTCTTCTTT TTTAGCTTTA 6900 


TAGTTACGAT AAACATCTAC AGTTTTTTGT AAATCAGCTT GCTCTTTAGA ATATTTACGT 6960 

AATTTATCTG AATCATTTAC AACATCTGGG TCACTTAACA GTTCATTTAA CTGTTCGTAT 7020 

CTTTCTTCTA CAATATCTAA TTGATCAAAC ACTTATAATT CCTCCTTATT ATTATCACTA 7080

GGTGCTACGA TATGGTGCGC GCGACAACGT GGCTCATAAC TTTCATTGGC ACCTACTAAG 7140 

ATAATCGGAT CATCGATTTT AGCTGGTTTA CCATTTATTA ATCGTTGCGT TCTACTAGAT 7200 

GAAGAACCAC AAACAGCACA AACTGCTTGA AGTTTCGTTA CTTGTTCACT GACAGCCATC 7260

AATTTAGGCA TTGGTTCGAA CGGTTCGCCC CTAAAATCCA TATCTAATCC AGCAACAATA 7320 

ACACGGTGTC CATCTGCTGA TAGTTTTTCT ACTATACTTA CAATTTCATC GTCAAAAAAT 7380

TGCAcTTCGT CTATTCCTAT AACATCAACA TTAGTTAAGT CGTGCGTCAT AATTTCACTT 7440

GCTTTAGAAA TATTAATCGC TTCAATGGCA TTACCATTAT GAGAGACCAC TTTTTCTTTA 7500 

TGATATCGAT CATCAATCGC CGGTTTAAAT ACAACGACTT TTTGTTTAGC GTATATACCC 7560 


CTTCTTAGAC GTCTTATTAG TTCTTCGGAT TTACCGCTAA ACATACTACC TGTAATACAT 7620 

TCTATCCAAC CGGAATGGTA AGTTTCATAC ATTGAGAGTn CCACCTTTTT CAAAACATAA 7680 

TCGCTTTATT ATATCATATT TCAAATATTC ATAAATGTCT TTnTCATAAT TATATCGATA 7740 


TTGTACATGA ACAATTATTT TA 7762 
(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2583 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
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TAAAAAAATT ATTATCAATG ATGAACTAGA ATTGACTGAA 


TTCCACCAAG 


AACTTACTTA 


120 




TATTTTAGAC AACATAnAAG GGAATAATAA TTATGGTAAG GAATTTGTTG CAACCGTTGA 180
AGAAACATTC GACATTGAAT AaAGCGGGGT GgaAGCACTA TGAATCAATG GGATCAGTTC 240
TTAACACCTT ATAAGCAAGC GGTTGATGAG TTGAAAGkGA AcTTaAAGGC ATGCGCAAAC 300
AATATGAAGT TGGTGAACAA GCGTCGCCAA TAGAATTTGT TACTGGTCGT GTTAAACCAA 360
TCGCTAGTAT TATAGATAAG GCAAACAAAC GACAAATACC ATTTGATAGG TTAAGAGAAG 420
AAATGTACGA TATCGCTGGT TTAAGAATGA TGTGCCAATT TGTTGAAGAT ATTGATGTTG 480
TCGTCAATAT TTTAAGACAA AGAmAAGATT TTAAAGTAAT TGAAGAACGA GATTATATTC 540
GTAACACTAA AGAAAGTGGT TACCGCTCGT ATCATGTCAT TATTGAATAT CCAATTGAAA 600
CATTACAAGG CCAAAAATTT ATATTGGCTG AGATTCAGAT TCGTACATTA GCAATGAATT 660
TCTGGGCAAC GATTGAACAT ACTTTACGAT ATAAATATGA TGGTGCTTAT CCGGATGAAA 720
TTCAACATCG TTTGGAAAGA GCGGCAGAAG CAGCGTATTT ACTTGATGAA GAGATGTCTG 780
AAATTAAAGA TGAAATTCAG GAAGCTCAAA AATATTACAC GCAAAAACGT TCTAAAAAAC 840
ATGAAAATGA TTAACGAGGT GTTATAAATC ATGCGTTATA CAATTTTAAC TAAAGGTGAC 900
TCCAAGTCTA ATGCCTTAAA GCATAAAATG ATGAACTATA TGAAAGrTTT TcGCATGaTT 960
GaGGATrGTG AAAaTCCTGA AATTGTTATT yCAGTTGGTG GTGACGGTAC ATTACTACAA 1020
GCATTCCATC AGTATAGCCA CATGTTATCA AAAGTGGCAT TTGTTGGAGT TCATACAGGT 1080
CATTTAGGAT TTTATGCGGA TTGGTTACCT CATGAAGTTG AAAAATTAAT CATCGAAATT 1140
AATAATTCAG AGTTTCAGGT CATTGAATAT CCATTGCTTG AAATTATTAT GAGATACAAC 1200
GACAACGGCT ATGAAACAAG GTATTTAGCA TTAAATGAAG CAACGATGAA AACTGAAAAT 1260
GGCTCAACAC TTGTTGTGGA TGTTAACTTA AGAGGGAAAC ACTTTGAGCG ATTTAGAGGC 1320
GATGGATTAT GTGTATCAAC ACCTTCGGGT TCAACGGCTT ATAACAAAGC GCTAGGTGGC 1380
GCACTGATAC ATCCTTCACT TGAAGCAATG CAAATTACAG AAATTGCCTC GATAAATAAT 1440
CGTGTGTTTA GAACGGTAGG ATCACCACTT GTATTACCAA AGCATCATAC ATGTTTAATA 1500
TCACCAGTTA ATCATGATAC CATTAGAATG ACGATAGATC ATGTTAGTAT CAAACATAAA 1560
AATGTTAATT CAATACAATA CCGTGTAGCA AATGAAAAAG TGAGGTTTGC ACGTTTTAGA 1620
CCATTCCCAT TCTGGAAACG TGTGCACGAT TCTTTCATAT CAAGTGATGA AGAACGATGA 1680
AATTTAAGTA TCATATATCA CAACAAGAAA CTGTTAAAAC TTTTTTAGCA CGACATGATT 1740
TTTCTAAGAA GACAGTGAGC GCCATTAAAA ATAATGGCGC TTTAATTGTT AATGATGAAC 1800
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AAATACCGAG TGTTAATTTA ATACCTTATG CTCGTAAGCT AGAAGTATTG TATGAAGATG 1920 

CTTTTATCAT CATAGTTACT AAACCAAACA ATCAAAATTG TACGCCTTCG AGAGAACATC 1980 

CTCATGAAAG TTTAATCGAA CAAGTACTAT ATCATTGTCA GGAACATGGT GAAAATATTA 2040 

ACCCACATAT TGTTACGCGT CTAGATCGTA ATACAACTGG TATTGTGATA TTCGCTAAAT 2100 

ATGGACATAT CCATCATTTA TTTTCTAAAG TAAACTTGAA AAAAATATAT ACTTGCCTTG 2160 

TATATGGTAA AACCCATACA TCTGGTATTA TTGAAGCTAA TATTAGACGG TCAAAGGATA 2220 

GGATTATAAC TAGAGAAGTT GCCTCGGATG GTAAATACGC TAAAACATCT TATGAAGTAA 2280

TAAATCAGAA TGATAAATAC AGTTTATGCA AAGTTCATTT GCATACGGGA CGTACACATC 2340 

AAATTCGTGT ACATTTTCAA CATATTGGGC ATCCAATTGT GGGAGATTCT TTGTATGATG 2400 

GTTTTCATGA CAAAATTCAT GGTCAAGTAC TGCAATGTAC GCAAATATAT TTTGTTCATC 2460 

CAATCAATAA GAACAATATT TATATTACAA TTGATTATAA GCAATTACTT AAATTATnCA 2520

ATCAACTCTA ATnCACACAG GGGGTGTAAG TATGTCAATG AnCACAGATG AAAAAGAGCG 2580 
TGT 2583

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1818 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:



ATCAAGTGAT ACATTTAACT GGTAAAGGAT TAAnAGATGC TCAAGTTAAA AAATCnGGAT 60
ATATACAATA TGAATTTGTT AAAGAGGATT TnACAGATTT ATTnGCAATT ACGGATACAG 120
TAATAAGTAG AGCTGGATCA AATGCGATTT ATGAGTTCTT AACATTACGT ATACCAATGT 180
TATTAGTACC ATTAGGTTTA GATCAATCCC GAGGCGACCA AATTGACAAT GCAAATCATT 240
TTGCTGATAA AGGATATGCT AAAGCGATTG ATGAAGAACA ATTAACAGCA CAAATTTTAT 300
TACAAGAACT AAATGAAATG GAACAGGAAA GAACTCGAAT TATCAATAAT ATGAAATCGT 360
ATGAACAAAG TTATACGAAA GAAGCTTTAT TTGATAAGAT GATTAAAGAC GCATTGAATT 420
AATGGGGGGT AATGCTTTAT GAGTCAATGG AAACGTATCT CTTTGCTCAT CGTTTTTACA 480
TTGGTTTTTG GAATTATCGC GTTTTTCCAC GAATCAAGAC TTGGGAAATG GATTGATAAT 540
GAAGTTTATG AGTTTGTATA TTCATCAGAG AGCTTTATTA CGACATCTAT CATGCTTGGG 600
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20 



CTCATGTTAA AGCGCCACAA AATTGAAGCA TTATTTTTTG CATTAACAAT GGCATTATCT 720
GGAATTTTGA ATCCAGCATT AAAAAATATA TTCGATAGAG AAAGACCTAC ATTGCTGCGT 780
TTAATTGATA TAACAGGATT TAGTTTTCCT AGCGGTCATG CTATGGGATC AACTGCATAT 840
TTTGGAAGTG GTATCTATCT ATTAAATCGA TTAAATCAAG GTAATTCAAA AGGTATTCTT 900
ATAGGGTTAT GTGCAGCTAT GATTTTATTG ATTTCCATAT CACGTGTATA TCTAGGTGTA 960
CATTATCCAA CAGATATTAT TGCCGGCATT ATTGGTGGAT TATTTTGcAT TATTTTATCA 1020
ACGTTATTAC TTAGAAATAA ATTAATAAAT TAAATAGTAA AAAAACAAAA GCAGTAAACC 1080
TAAAGTGTCG TAAGGGTTTA CTGCTTTTAT AAAACGTTGT TATAACGTAT ATTGTCTTTT 1140
ACGGGCATAT AAnAGGGGAA TATTTGAnAA TGACCAATCC AACAAGAACG AAACGTTGTG 1200
GGGGGGATGT TCTATGTGGT ATTGATAATC ATTTTCAACT ACTATTATAC ATTAGTGAGA 1260
ATCATTGTCA ATTAGAAACT AAAACTTTTT TTGAATATTT TTTAAGAATA GTAAATAAAA 1320
CGCATGATTA CGCTATTTTA GAAAATAAAA AAATTTGTAT TTCTCATTAG AATTAGAATA 1380
TTTAAAAGTG ATGAGGTTTA AACATTATAT TGTTTACATA CTCCTTTTGA ATTCATACAT 1440
TATGAAATGT tACTTCCAAG TTCAAAATCG CACATTGAAA TGATGTGTGA AATGTTTAAA 1500
CTACGGTCAT tTTGTGmAAA TAAAGrTAAT AACTATTCAT TTTACAATAG TGAAAAGTCA 1560
GTATATGACA ACAATTAATA TTGCGGTAAG GCCTTGTGTT ACAGTATTCT ATATTTAAGT 1620
ACTGCAATCA GAATTAACAG AATGCCATTA ACTGATTATT AAATATTTGA GTTAATAAAT 1680
AATTAATGAT TGTAGCTTGA AAAATTTAAA ACATGGTTAT TGATTTGTGA TAAAATTTAA 1740
ACGTAAACAA ACTAATTTAA AAAGCAACTA TTGTATAGAA AAATACAAAA TTTAAAATAT 1800
ATTACCTTAT TAGAAAAA 1818



(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12658 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
TGTTTAAACA ATAGGGGGAA TCTTATGATT GAAAAATTAG TAACCTTTTT AAATGAGGTT 60 
GTTTGGAGTA AGCCATTAGT TTATGGTTTG CTAATTACTG GTGTGCTATT TACATTGCGT 120

ATgCGATTTT TTCAAGTTAG ACATTTTAAA GAAATGATTC GATTAATGTT TCAAGGAGAG 180

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GGTACAGGTA ATATTGTCGG TGTATCTACT GCAATATTTA TAGGAGGACC TGGTGCAGTA 300 

TTTTGGATGT GGATTACTGC GTTTTTAGGT GCAAGTAGTG CTTTTATTGA ATCTACACTT 360 
GGTCAAATAT TCAAGAGAGT TGAAAATAAT GAATACCGTG GTGGACCAGC GTATTATATT 420

GAATATGGTA TTGGTGGTAA ATTTGGTAAA ATTTACGGAA TTATCTTTGC TATTGTTACG 480 

ATTATCTCAG TAGGTCTATT GCTTCCTGGT GTGCAATCTA ACGCTATAGC AAGTTCTATG 540 

CATAATGCGA TTCATGTTCC ACAATGGTTA ATGGGTGGTA TTGTTGTAGT TATTTTGGGA 600 

TTAATTATTT TTGGTGGTGT ACGTATTATT GCCAATGTTG CAACAGCCGT TGTACCATTT 660 

ATGGCAATTA TTTACATACT GATGGCTGTC ATTATCATTT GTATCAATAT ACAAGAAGTG 720 

CCAGCGTTAT TTGCATTAAT TTTCAAATCA GCATTTGGAT TACAATCTGC TTTTGGTGGT 780 

ATCGTTGGCG CAATGATAGA GATTGGTGTT AAACGTGGAT TATATTCAAA TGAGGCTGGT 840

CAAGGTACAG GTCCACACGC AGCAGCGGCa gcAGaAGTAT CACATCCAAG TAAACAAGGT 900

CTAGTACAAG CATTTTCAGT TTATATTGAT ACATTATTTG TATGTACTGC AACTGCTCTG 960 

ATTATACTTA TTTCTGGTAC ATATAATGTG ACTGATGGTA CGGTTAATGC GAATGGCACA 1020 

CCGCATTTAA TTAAAGATGG CGGTATTTAT GTTgAAAATG CAACAGGTAA AGATTATTCA 1080

GGTACTGCGA TGTATGCACA AGCCGGCATt GATAAAGCGT TCCATGGCAG TGGTTATCAA 1140 

TTTGATCCTA CTTTCTCTGG CGTAGgTTCG TACTTTATTG cATTTGCTTT ATTCTTCTTT 1200 

GCATTTACTA CAATTTTGTC GTACTACTAC ATTACAGAAA CAAATGTTGC TTATTTAACG 1260 

CGTAATCAAA ATAATCAAGT TTCATCGATA TTTATTAATA TTGCTCGTGT GATTATTTTG 1320 

TTCGCTACAT TTTACGGTGC AGTTAAAACA GCTGATGTAG CATGGGCATT CGGTGATTTA 1380 

GGTGTAGGTC TAATGGCTTG GTTAAATATC ATTGCGATTT GGATTTTACA TAAGCCTGCC 1440 

GTAAATGCTT TAAAAGATTA TGAAATTCAA AAGAAACGTT TAGGCAACGG TTATAATGCA 1500 

GTTTATCAAC CTGATCCGAA TAAATTACCT AATGCTGTCT TTTGGTTGAA GACGTATCCA 1560 

GAACGTTTAA AACAAGCACG TGCCAAAAAG TAATCTACTT TTGTTTATAG TATATGTAGT 1620 

GATCATTTGA TAAAAAAGAA AAGTATTGAG AATTTTAGGt GCTCAGAAAT TTGAATTTTA 1680 

AAAATATAGT GTCTCTTGGT ACAATAACAA TACAACTACT AGGGGCACTT TTTTATGTCA 1740

GAATTTAAAA CTGGTAAGAT TAATAAACAT GTTTTATATA GTAATATTTT AAATAGAGAT 1800 

GTCACGTTAA GTATTTATTT ACCAGAATCT TATAATCAAC TTGTTAAATA TAATGTCATT 1860 

CTTTGCTTTG ACGGATTAGA TTTTTTACGT TTCGGGAGAA TACAACGTAC ATATGAATCG 1920

TTAATCAAAG AAGCGCGTAT TGATGATGCG ATCATTGTTG GATTCCATTA TGAAGACGTT 1980 






GTCGGTAAAG AAATATTGCC ATTTATTGAC TCGACGTTTT CTACACTGAA AGTAGGTAAT 2100 

GCAAGGTTAT TAGTAGGGGA TAGTTTAGCG GGTAGTATTG CCTTATTAAC GGCGTTGACC 2160 

TATCCAACGA TTTTTAGTCG TGTAGCAATG TTAAGTCCAC ATTCAGATGA AAAAGTATTA 2220 

GATAAGCTAA ATCAATGTGC AAATAAAGAA CAATTGACAA TTTGGCATGT CATTGGTCTA 2280 

GATGAAAAAG ATTTTACTTT ACCAACAAAT GGTAAGCGTG CCGATTTCTT AACACCGAAT 2340 

AGAGAATTAG CTGAACAAAT TAAGAAATAT AATATAACTT ATTATTACGA TGAATTTGAT 2400 

GGTGGTCACC AATGGAAAGA TTGGAAACCA TTGCTGTCAG ATATATTATT GTATTTTTTA 2460 

AGTAAAAACA CAGATGATCA ACTTTATGAA TAATTTACAT TAGTAGATTT AGTATGAATT 2520 

GTCTTCATAT AGTCTGGTCT ATAATATAAT TTATAAAAGA TTTTACTGTT TAATTTAATT 2580 

TAAATTTGAC GAAATTGCAA AAGATGTATA ATGAATTATT TTTAATGTAA CGGTTTTCAA 2640

AGAAATTTGA TATAATAGCA ATAGGTTAAA CAAAGGAGGA ATTCAGATGA TTTTAGGATT 2700 

AGCATTAATT CCATCAAAGT CATTTCAAGA AGCGGTGGAT TCTTACCGTA AAAGATATGA 2760 

TAAACAGTAT TCACGAATTA AACCACATGT GACAATTAAA GCGCCATTTG AAATTAAAGA 2820 

TGGTGATTTA GATTCTGTCA TTGAACAGGT TAGAGCTCGT ATTAATGGTA TACCAGCAGT 2880

AGAAGTTCAT GCTACAAAAG CTTCTAGCTT CAAACCAACG AACAATGTGA TTTACTTTAA 2940

AGTTGCGAAG ACGGACGACT TAGAAGAATT GTTTAATCGC TTTAATGGAG AAGATTTCTA 3000 

TGGAGAAGCT GAACATGTTT TTGTGCCACA CTTTACAATA GCACAAGGAC TATCTAGCCA 3060

AGAATTCGAA GATATTTTTG GTCaAGTAGC ATTAGCTGGG GTAGACCaTA AAGAAATTAT 3120 

CGATGAATTA ACTTTGTTAC GTTTTGACGA TGACGAAGAT AAATGGAAAG TTATTGAAAC 3180 

GTTTAAATTA GCTTAAGTAA CATAATAGTA TTGTTAATCG TAGTATGTTT GAATTAATAA 3240

GAAAATGGTC ATTTTTATTG AATGTAATAA AAATGACCAT TTTCTTTATT TTAAAATACG 3300 

TTTTAACCTT ACTTAGCTTT TTCTCTATTT ACTATAAAGT rGCTTCCATA AAATACAGCT 3360 

AAGACTAAAA AGATTAATGC CGAGAAATAA AATGTATTGT TTAAATTGTT GGTAAATTGT 3420 

GTAATTAATC CGCCAAATAA TGGCCCTATC ATTGAGCCGA ATCCTTGGAT ACTATTAAAA 3480

ACACCCCAAG TTTCTTCTTG TTCATCTGAT TTGATAAATC GTGCCATAAA GGTATTCCAT 3540 

GCTGGTAATA AGATGCCATA CATTAGACCG ATAGCTAAAG CGATAATCCA CAAGATGTGA 3600 

ATATTAACAA TCATAGATAG AGTAAAAATT AATATCATGT ATAAAATAAA TCCGCTTAGA 3660 

ATAACACCAT ACATAAAGTT TCTGCTGCGG TTATCTATTA GTTTCGATAA AAATAGCATC 3720 

GAAACTGCAC AGCCGATACC ACCAATAATG ATTGCAACAG TATATTCAAT TGTGCTTACG 3780 
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TGTAAAAGAA TACCAGGGAA CaACAATAAA 
TGGcGCTTTG TCACATCAAC AATTTGTCTC 3900

AATTGAGCTT TAACTGGACG AGTATTATAA TTTGTTAACT TTACATCGAC AAAATAATAT 3960

AATATCCATG CAATTAAAAC GACTAAAGAC ATCATGAAGG CAAAGCGTGT TGGGTGCACT 4020

TTGATAAGTA GATTCATAAA AACCATACCT ACCAATAGGC CTAACAACCA TGAAAAATAA 4080

ACATAGCCCA TTTGTTTGCC ACGTTTATCT TCTTCAACAC TGGATAACAT AATGACCCAA 4140

ATAGGACTAA CTGCAATACC GAGCATCATA GCACTAAATA TGATTACAAA AGGTGATGCT 4200

GGAAACCAAA TAACTAAAAA TAAACTTGTA AATGCTAAAA TAAATCCAGT CGTTAAAACG 4260

ATTTTTGTGC CGAATTTTTT CAGTAAAAAT CCTATAACAA AGTTTGTAGA TGCATCAGCA 4320

ATAAAATGTA TTGAAAATGC TAGAGACGTT ATTGCTACAG CAATGGATGT AACTGTTGGC 4380

AAGAAATTAA TATAGCTTAG GATATACATG CCTCTCGCAA ATTCCATTAA AAATAAGATA 4440

ATAAGCaTTA AAATGAAATT TTTATGATTA GCGTAATTAT TTAACGAAGA ATCTTGCATA 4500

TAAAGGAACC TTTCCATAAA TCTCTTGTGG TTGTGATGAA TGACCGATTA AATCAAGTAA 4560

GTCTCGACAT ATTGTCTGTG TAGCATACTT AATTTTATCT TGTTCCATTG TACTAATCAT 4620

GTTAGTTAAT TGCTCATTAC CGTTAGTTAA ACTTGCTACA ATTTTTATTG CTTCTTCTGG 4680

AGTATCAGCG ATTTTACCAA AACCTTTTTC TTCAAAGTAA AGGGCATTTT CAAGCTCTTG 4740

ACCAGGTGCA GGATTTAGGA AAATCATTGG AATACAACGG GCGAAACCTT CAGTTATTGT 4800

GATACCACCA GGTTTCGTAA TCATAAGTTG ACTTGATGCC ATCCATTCAT TCATGTGTTT 4860

GGTATAACCT AGAATCAATA CATTCTCGTT AGATTTAAAC TTAGCTGTTA AAGAACGCTT 4920

TAGCTCTTTG CTCTTACCAC AAATCATAAC TACTTGTGCA TTTGCaCTTT tCGCTAATAT 4980

ATCAGTAATC ATCGTGTCAA AACCTTTAGA TACACCAAAT GCACCAGCTG aCATTAAAAT 5040

AGTITCCTTA TCTGGATCTA AGTTGTTGTC TATTAACCAC TGCTTTTGAT TAATAGGCGT 5100

TTCAAATTTG TTATCAATAG GAATACCTGT CaCTTTAACT GTTGAAGGAT CAATACCTAC 5160

GTCTATGAAG TCTTGTTTCG TTTCTTTTGT TGCCACATAA TATCTTGTTG AATACGGCGT 5220

AATCCAGTTT TTATGTAAGC GATAGTCTGT CATCACTGTA GCAACTGGAA TATTAATGTT 5280

AAATTGCTCA GTTAGTACCG ACATAACTGG TGTAGGAAAC GTTAATAATA TTAAATCTGG 5340

CTTTTCTTTT ATCAATAAAT TAATTAACTT ATTAAGTCCA TAGTATTTGT AAAAACATTT 5400

GTCTAGTTTA TCTGGGCGGC TGTAATAAAA CCCTTTGTAC ATATTTCTAA AATATTTAAA 5460

GCTATTGATA TACCATTTTT TACAAATAGA AGTCAAAATT GGATGAGCTT CCATAAATAA 5520

ATCGTGCTCA ATGACGCTTA AATGGTCTAG ATTCATATCA TTAAGTTGAT TAACGATACT 5580
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TTGAGTAACC ATTAATAGCC ACCCTCCGTT 
AGTTTGAAAA TTTTATTTAA GTGTAACTTA 5700

TTTTACGGCA TTATAAAAGA AATAAAGACG CAAAGTCGTT ACATTTATAG CAATTTTAAT 5760

CTATAGATGA ATTGATACAA AATAAAACGT TATTTTATAA AGCAATTTAT TGTTCTATGT 5820

TTTATTTGTA TATTTAAAAT TATCCAGTAT ACAATTATAG CATATTTTTG GAAACAATTA 5880

TGATATTATA CCATGTTACA AGATGGTTTT AATAATTTAA GATGAGCCAT AATTGTAAAA 5940

CTAATTCATA ATACCGTATG TTTTATTTTT AATAGTAGAA ATTAGAAAAT GCTGATTAGT 6000

AGGATATAAC AGTGAAATTA TAAATTTATT AACATCAACA AAACGTGTAT AATAAACATA 6060

TTGTAGAAAA AGGAGCGGTT CAGTTTGGAT GCAAGTACGT TGTTTAAGAA AGTAAAAGTA 6120

AAGCGTGTAT TGGGTTCTTT AGAACAACAA ATAGATGATA TCACTACTGA TTCACGTACA 6180

GCGAGAGAAG GTAGCATTTT TGTCGCTTCA GTTGGATATA CTGTAGACAG TCATAAGTTC 6240

TGTCAAAATG TAGCTGATCA AGGGTGTAAG TTGGTAGTGG TCAATAAAGA ACAATCATTA 6300

CCAGCTAACG TAACACAAGT GGTTGTGCCG GACACATTAA GAGTAGCTAG TATTCTAGCA 6360

CACACATTAT ATGATTATCC GAGTCATCAG TTAGTGACAT TTGGTGTAaC GGGTACAAAT 6420

GGTAAAACTT CTATTGCGAC GATGATTCAT TTAATTCAAA GAAAGTTACA AAAAAATAGT 6480

GCATATTTAG GAACTAATGG TTTCCAAATT AATGAAACAA AGACAAAAGG TGCAAATACG 6540

ACACCAGAAA CAGTTTCTTT AACTAAGAAA ATTAAAGAAG CAGTTGATGC AGGCGCTGAA 6600

TCTATGACAT TAGAAGTATC AAGCCATGGC TTAGTATTAG GACGACTGCG AGGCGTTGAA 6660

TTTGACGTTG CAATATTTTC AAATTTAACA CAAGACCATT TAGATTTTCA TGGCACAATG 6720

GAAGCATACG GACACGCGAA GTCTTTATTG TTTAGTCAAT TAGGTGAAGA TTTGTCGAAA 6780

GAAAAGTATG TCGTGTTAAA CAATGACGAT TCATTTTCTG AGTATTTAAG AACAGTGACG 6840

CCTTATGAAG TATTTAGTTA TGGAATTGAT GAGGAAGCCC AATTTATGGC TAAAAATATT 6900

CAAGAATCTT TACAAGGTGT CAGCTTTGAT TTTGTAACGC CTTTTGGAAC TTACCCAGTA 6960

AAATCGCCTT ATGTTGGTAA GTTTAATATT TCTAATATTA TGGCGGCAAT GATTGCGGTG 7020

TGGAGTAAAG GTACATCTTT AGAAACGATT ATTAAAGCTG TTGAAAATTT AGAACCTGTT 7080

GAAGGGCGAT TAGAAGTTTT AGATCCTTCG TTACCTATTG ATTTAATTAT CGATTATGCA 7140

CATACAGCTG ATGGTATGAA CAAATTAATC GATGCAGTAC AGCCTTTTGT AAAGCAAAAG 7200

TTGATATTTT TAGTTGGTAT GGCAGGCGAA CGTGATTTAA CTAAAACGCC TGAAATGGGG 7260

CGAGTTGCCT GTCGTGCAGA TTATGTCATT TTCACACCGG ATAATCCGGC AAATGATGAC 7320

CCGAAAATGT TAACGGCAGA ATTAGCCAAA GGTGCAACAC ATCAAAACTA TATTGAATTT 7380
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GTTTTAGCAT 


CAAAAGGAAG AGAACCATAT CAAATCATGC CAGGGCATAT TAAGGTGCCA 7500

CATCGAGATG ATTTAATTGG CCTTGAAGCA GCTTACAAAA AGTTCGGTGG TGGCCCTGTT 7560


c 
o 


GATTAATAAA AGATTTATTG 


A i. vxAAvjKj 1 AA 


AACTATTGAT 


GTTTATTTAT 


TCGAAGCATT 


7620 




AAATAACCAG 


A T A A TC* A TTrt 




TTGGTTTTGG 


TCATATCAGA 


TGGCAATGAC 


7680 




ATT ARATTC A A 


VJ AnAl. Hull 


1 1 \j AAvj L-AA 1 


ACTCATGCAA 


TTGTTTGTTT 


TTAAAGAAGA 


7740 


10 




flTA 7A TOf 1 TATTY" 1 




AACAGATTGG 


ATAGAAACAT 


ATAAAAAGGA 


7800 




U/vvlunV. x AA 


1 VjAAVw 1 1 AAA 


(jUAAvjAAGTT 


GAGTCTAGAA AGACTTTTGC 


GATTATTTCA 


7860 


15 




LLAvjGGAAAAC 


K IV P^TT T\ IV /*»rp 

AACGTTAACT 


GAAAAACTAT 


TGTACTTCAG 


TGGTGCTATT 


7920 




GTACAGTTAA 


AGGGAAGAAG 


ACTGGTAAAT 


TTGCGACAAG 


TGACTGGATG 


7980 




AAAG1 ISAAC 


AAGAG CGTGG 


TATTTCTGTA 


ACTAGTTCAG 


TAATGCAATT 


TGATTACGAT 


8040 


20 


^ ?V T"T1\ TJi IV IV IV 

GA I r A I AAAA 


TCAATATCTT 


AGATACACCA 


GGACATGAAG 


ACTTTTCAGA 


AGATACGTAT 


8100 




AGAACATTAA TGGCAGTTGA CAGTGCTGTC ATGGTCATAG ACTGTGCAAA AGGTATTGAA 8160

LCAuAAACAT TGAAGTTATT TAAAGTTTGT AAAATGCGTG GTATTCCAAT CTTTACATTC 8220

ATTAATAAAT TAGACCGAGT AGGTAAAGAA CCATTTGAAT TATTAGATGA AATCGAAGAG 8280

ACATTAAATA TTGAAACATA CCCTATGAAT TGGCCAATTG GTATGGGACA AAGTTTCTTT 8340

GGCATCATTG ATAGAAAGTC TAAAACAATT GAACCATTTA GAGATGAAGA AAATATATTA 8400

CATTTGAATG ATGATTTTGA GTTGGAAGAA GATCATGCAA TTACAAATGA TAGTGATTTT 8460

GAACAAGCGA TTGAAGAATT AATGTTGGTT GAAGAAGCGG GTGAAGCCTT TGATAATGAC 8520




GCGCTGTTGA 


GTGGAGACTT 


& 7A J*" 1 7A PTPT 7A 


TTTTTCGGTT 


CAGCTTTAGC 


TAACTTTGGT 


8580 


35 


GTACAAAATT 


TCTTAAATGC 


ITLTPTTP BT* 


TTTGCGCCAA 


TGCCAAATGC 


GAGACAAACA 


8640 




aaagSagacg TTGAAGTAAG 


v.Lv.ui X XuAl 


GATTCATTTT 


CAGGATTTAT 


CTTTAAAATT 


8700 


40 


CAAGCCAACA 


TGGACCCTAA 




AGAATTGCCT 


TTATGCGTGT 


CGTTAGTGGT t 


8760 


GCATTTGAAC GTGGTATGGA TGTTACTTTG CAACGTACTA ATAAAAAGCA AAAGATCACA 8820

CGTTCAACGT CATTTATGGC AGACGATAAA GAAACTGTGA ATCATGCTGT AGCAGGCGAT 8880

ATCATTGGAC TATATGATAC TGGTAATTAT CAAATTGGAG ATACTTTAGT TGGTGGAAAA 8940

CAAACCTACA GTTTCCAAGA TTTACCACAA TTTACGCCAG AAATTTTTAT GAAAGTTTCT 9000

GCTAAAAACG TCATGAAACA GAAGCATTTC CATAAAGGTA TTGAACAATT AGTACAAGAA 9060

GGTGCGATTC AATACTATAA AACATTACAC ACAAACCAAA TTATTTTAGG TGCTGTTGGT 9120

CAGTTACAAT TTGAAGTTTT CGAACATAGA ATGAAAAACG AATATAATGT TGATGTTGTT 9180
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10 



AAGATGAACA CATCAAGATC GATTTTAGTG AAAGATAGAT ATGACGATTT AGTATTCTTA 9300

TTTGAAAATG AATTTGCAAC AAGATGGTTT GAAGAGAAAT TCCCTGAAAT TAAATTGTAT 9360

AGTTTACTTT AACAGCTCAA TTGTATAATC GAATTTGTTA CATTAAAAAT AATTGTTTCG 9420

TTGAAGAAAA ATAAATTGTA TATTTTAAAA GAAAAAGGTA TACTATGATG TATCAAATGA 9480

ATAACCTATG GCATTTTGTC AGAGGGGAGT AACTTAAGAA TCATGACCGT ATAAATGaTT 9540

CGACACTTTA TCGTCATTAC GArGATATCT TCCGGTAAAG TGGGCAATTT AAATTGCTTA 9600

GTGAGACCTT TGCTATTTAT TTAGCATAGG TCTTTTTGTT TGTACTTAAC TTATTTATTT 9660

AAAGGAGTTG TACATGTTAA TGGATCCAAG TTTGATCTTA CCTTATTTAT GGGTACTTGT 9720

CGTTTTAGTA TTTTTAGAAG GCTTATTAGC AGCAGATAAC GCGATTGTTA TGGCTGTAAT 9780

GGTTAAGCAC TTACCACCCG AACAACGTAA AAAAGCTTTG TTTTACGGTT TGTTAGGTGC 9840

ATTTGTATTT AGATTTTTAG CATTATTCTT AATTAGTATT ATCGCGAACT TTTGGTTTAT 9900

TCAAGCTGCA GGAGCGGTTT ACTTAATTTA TATGTCAATC AAAAATCTGT GGCAGTTCTT 9960

TAAACACCCA GAAATTGAAA GTCCTGAAGC TGGAGATGAT CATCATTATG ATGAATCTGG 10020

TGAAGAGATT AAAGCAAGTA ACAAATCATT CTGGGGAACT GTGTTGAAAA TAGAATTTGC 10080

AGATATCGCA TTTGCCATTG ATTCTATGCT TGCTGCTTTA gCTATTGCTG TAACACTTCC 10140

TAAAGTTGGT ATTCACTTTG GTGGTATGGA CTTAGGTCAG TTCGTAGTCA TGTTCCTAGG 10200

TGGAATGATT GGTGTTATTC TAATGCGTTA TGCAGCAACA TGGTTTGTAG AGCTATTAAA 10260

CAAATATCCA GGACTTGAAG GTGCAGCCTt CGCGATCGTT GGTTGGGTAG GTGTTAAATT 10320

AGTTGTCATG GTATTAGCGC ACCCAGACAT CGCTGTATTG CCTGAGCACT TCCCACATGG 10380

CGTATTATGG CAATCTATTT TCTGGACAGT ACTAATTGGA TTAGTAATTA TCGGTTGGTT 10440

AGGTTCAGTT GTTAAAAATA AAAAATCGCA TAAATAATTG ATGTGAAGCG GACAATCTTA 10500

ATTTAGTTTA AGGTTGTCCT TTTTCATTTA ATTGAGTGAT TTATGAAAAA TGGATTTTGA 10560

AGAATGTGAA TCAAAAGATG CGATATAGTA TTAAGAAAAT GTGCCTTTTA TATTTAGCAT 10620

TTTTTCAATA GAAATTATAT AGATTTTAAA GCAAATTAGG TGTTAATGTG TCATAATGAT 10680

AAGTGATTTT ATTGAATGGA GTGGACATTA GTGGATATTG GTAAAAAACA TGTAATTCCT 10740

AAAAGTCAGT nACCsaCGTA AGCGTCGTGA ATTCTTCCAC AACGAAGACA GAGAAGAAAA 10800

TTTAAATCAA CATCAAGATA AACAAAATAT AGATAATACA ACATCAAAAA AAGCAGATAA 10860

GCAAATACAT AAAGATTCAA TTGATAAGCA CGAACGTTTT AAAAATAGTT TATCATCGCA 10920

TTTAGAACAG AGAAACCGTG ATGTTAATGA GAATAAAGCT GAAGAAAGTA AAAGTAATCA 10980
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w 



ftAAiJ-LAiTA GATTCAGTGG ACCAAGATAC AGAGAAATCA AAATATTATG AGCAAAATTC 11100

lUAAtaCGACT XT AT CAACT A AATCAACCGA TAAAGTAGAA TCAACTGAAA TGAGAAAGCT 11160

AAb 1 1 v-ACjAT aaaaacaaag TTGGTCATGA AGAGCAACAT GTACTTTCTA AACCTTCAGA 11220

ALAltxATAAA GAGACTAGAA TTGATTCTGA GTCTTCAAGA ACTGATTCAG ACAGCTCGAT 11280

GCAGACAGAG AAAATAAAAA AAGACAGTTC AGATGGAAAT AAAAGTAGTA ATCTGAAATC 11340

TGAAGTAATA TCAGACAAAT CAAATACAGT ACCAAAATTG TCGGAATCTG ATGATGAAGT 11400

AAATAATCAG AAGCCATTAA CTTTACCGGA AGAACAGAAA TTGAAAAGAC AGCAAAGTCA 11460

AAATGAGCAA ACAAAAACCT ATACATATGG TGATAGCGAA CAAAATGACA AGTCTAATCA 11520

TGAAAATGAT TTAAGTCATC ATATACCATC GATAAGTGAT GATAAAGATA ACGTCATGAG 11580

AGAAAATCAT ATTGTTGACG ATAATCCTGA TAATGATATC AATACACCAT CATTATCAAA 11640

AACAGATGAC GATCGAAAAC TTGATGAAAA AATTCATGTT GAAGATAAAC ATAAACAAAA 11700

TGCAGACTCG TCTGAAACGG TGGGATATCA AAGTCAGTCA ACTGCATCTC ATCGTAGCAC 11760

TGAAAAAAGA AATATTTCTA TTAATGACCA TGATAAATTA AACGGTCAAA AAACAAATAC 11820

AAAGACATCG GCAAATAATA ATCAAAAAAA GGCTACATCA AAATTGAACA AAGGGCGCGC 11880

TACGAATAAT AATTATAGTG ACATTTTGAA AAAGTTTTGG ATGATGTATT GGCCTAAATT 11940

AGTTATTCTA ATGGGTATTA TTATTCTAAT TGTTATTTTG AATGCCATTT TTAATAATGT 12000

GAACAAAAAT GATCGCATGA ATGATAATAA TGATGCAGAT GCTCaAAAAT ATACGACAAC 12060

GATGAAAAAT GCCAATAACA CAGTTAAATC GGTCGTTACA GTTGAAAATG AAACATCAAA 12120

AGATTCmTCA TTACCTAAAG ATAAAGCATC TCaAGACGAA GTGGGATCAG GTGTTGTATA 12180

TAAAAAATCT GGAGATACGT TATATATTGT TACGAATGCA CACGTTGTCG GTGATAAAGA 12240

AAAT€aAAAA ATAACTTTCT CGAATAATAA AAGTGTTGTT GGGAAAGTGC TTGGTAAAGA 12300

TAAATGGTCA GATTTAGCTG TTGTTAAAGC AACTTCTTCA GACAGTTCAG TGAAAGAGAT 12360

AGCTATTGGA GATTCAAATA ATTTAGTGTT AGGAGAGCCA ATATTAGTCG TAGGTAATCC 12420

ACTTGGTGTA GACTTTAAAG GCACTGTGAC AGAAGGTATT ATTTCAGGTC TGAACAGAAA 12480

TGTTCCTATT QATTTCGATA AAGATAATAA ATATGATATG TTGATGAAAG CTTTCCAAAT 12540

TGATGCATCA GTAAATCCAG GTAACTCGGG TGGTGCTGTC GTCAATAGAG AAGGAAAATT 12600

AATAGGTGTA GTTGCAGCTA AAATTAGTAT GCCAAACGTT GAAAnTATGT CATTTGCA 12658


(2) INFORMATION FOR SEQ ID NO: 128: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6048 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

TGAAATnGAA TAGTACTATT GCAAGTGTAA AGAGGTTAAT TTTTGCCnCA CGCGGGACTT 60 

AAAAAGGCAA CCACTGGTTG TGACATATCC TTATTTACAT TTATAAATAT AAGGAGGAGG 120

TAGTAGTGAA AGACTTATTG CAAGCACAGC AAAAGCTTAT ACCGGATCTC ATAGATAAAA 180 

TGTATAAACG TTTTTCTATT CTTACTACTA TCTCAAAAAA TCAGCCTGTC GGACGTCGAA 240 

GTTTAAGCGA ACATATGGAT ATGACTGAAC GTGTACTGCG TTCTGAAACA GATATGCTTA 300

AGAAACAAGA TTTGATAAAA GTTAAGCCTA CCGGAATGGA AATTACAGCT GAAGGTGAGC 360 

AACTGATTTC GCAATTGAAA GGTTACTTTG ATATCTATGC AGATGATAAT CGTCTGTCAG 420 

AAGGTATTAA GAATAAATTT CAAATTAAGG AAGTTCATGT TGTTCCTGGT GATGCTGATA 480

ATAGTCAATC TGTTAAAACA GAATTAGGTA GACAAGCAGG TCAATTACTT GAAGGCATAT 540 

TACAAGAAGA CGCGATAGTT GCTGTAACTG GCGGATCCAC GATGGCATGT GTTAGTGAAG 600 


CAATTCATTT ATTACCATAT AATGTATTCT TCGTACCAGC CAGAGGTGGA CTAGGCGAAA 660 

ATGTTGTCTT TCAGGCAAAC ACAATTGCAG CCAGTATGGc aCAACAAGCT GGCGGTTATT 720 

ATACGACGAT GTATGTACCT GATAATGTCA GTGAAaCAAC ATATAATACA TTGTTGTTAG 780

AGCCATCAGT CATAAACACT TTAGACAAAA TTAAACAAGC AAACGTTATA TTACACGGCA 840

TTGGTGATGC GCTGAAGATG GCGCATCGAC GTCAATCACC TGAAAAGGTC ATTGAACAAC 900

TTCAACATCA TCAAGCTGTC GGAGAGGCAT TTGGTTATTA TTTTGATACA CAAGGTCAAA 960

TTGTCCATAA GGTTAAAACA ATTGGACTTC AATTAGAAGA CCTTGAATCA AAAGACTTTA 1020 

TTTTTGCAGT TGCAGGAGGC AAATCGAAAG GTGAAGCAAT TAAAGCATAC TTGACGATTG 1080 


CACCCAAGAA TACAGTGTTA ATCACTGATG AAGCCGCAGC AAAGATAATA CTTGAATAAG 1140 

AGATAAAAAG TTTAATACTT TTTAAATATC ATTTTAAAGG AGGCCATTAT AATGGCAGTA 1200 

AAAGTAGCAA TTAATGGTTT TGGTAGAATT GGTCGTTTAG CATTCAGAAG AATTCAAGAA 1260

GTAGAAGGTC TTGAAGTTGT AGCAGTAAAC GACTTAACAG ATGACGACAT GTTAGCGCAT 1320 

TTATTAAAAT ATGACACTAT GCAAGGTCGT TTCACAGGTG AAGTAGAGGT AGTTGATGGT 1380 

GGTTTCCGCG TAAATGGTAA AGAAGTTAAA TCATTCAGTG AACCAGATGC AAGCAAATTA 1440

CCTTGGAAAG ACTTAAATAT CGATGTAGTA TTAGAATGTA CTGGTTTCTA CACTGATAAA 1500 

GATAAAGCAC AAGCTCATAT TGAAGCAGGC GCTAAAAAAG TATTAATCTC AGCACCAGCT 1560 
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ACAGTTGTTT CAGGTGCTTC ATGTACTACA AACTCATTAG CACCAGTTGC TAAAGTTTTA 1680 

AACGATGACT TTGGTTTAGT TGAAGGTTTA ATGACTACAA TTCACGCTTA CACAGGTGAT 1740 


CAAAATACAC AAGACGCACC TCACAGAAAA GGTGACAAAC GTCGTGCTCG TGCAGCGGCA 1800 

GAAAACATCA TCCCTAACTC AACAGGTGCT GCTAAAGCTA TCGGTAAAGT TATTCCTGAA 1860 

ATCGATGGTA AATTAGATGG TGGTGCACAA CGTGTTCCTG TAGCTACAGG TTCATTAACT 1920 

GAATTAACAG TAGTATTAGA AAAACAAGAC GTAACAGTTG AACAAGTTAA CGAAGCTATG 1980

AAAAATGCTT CAAACGAATC ATTCGGTtAC ACTGAAGACG AAATCGTTTC TTCAGACGTT 2040

GTAGGTATGA CTTACGGTTC ATTATTCGAC GCTACACAAA CTCGTGTAAT GTCAGTTGGC 2100

GACCGTCAAT TAGTTAAAGT TGCAGCTTGG TATGATAACG AAATGTCATA TACTGCACAA 2160 

TTAGTTCGTA CATTAGCATA CTTAGCTGAA CTTTCTAAAT AATTTTAGTA TAGTTTTTAT 2220 

TCAAATACGC TAGTGCTCAG AACTATTTAG CATTAATTAA AGCTTATGAG TAAGCGGGGA 2280

GCACAAACGC TTCTCCGCTT ATTTTTATAT AAAATTTCCT AATTACAAGG AGGAAACACC 2340 

ATGGCTAAAA AAATTGTTTC TGATTTAGAT CTTAAAGGTA AAACAGTCCT AGTACGTGCT 2400 


GATTTTAACG TACCTTTAAA AGACGGTGAA ATTACTAATG ACAACCGTAT CGTTCAAGCT 2460 

TTACCTACAA TTCAATACAT CATCGAACAA GGTGGTAAAA TCGTACTATT TTCACATTTA 2520 

GGTAAAGTGA AAGAAGAAAG TGATAAAGCA AAATTAACTT TACGTCCAGT TGCTGAAGAC 2580 

TTATCTAAGA AATTAGATAA AGAAGTTGTT TTCGTACCAG AAACACGCGG CGAAAAACTT 2640

GAAGCTGCTA TTAAAGACCT TAAAGAAGGC GACGTATTAT TAGTTGAAAA TACACGTTAT 2700

GAAGATTTAG ACGGTAAAAA AGAATCTAAA AATGATCCAG AATTAGGTAA ATACTGGGCA 2760

TCTTTAGGTG ATGTGTTTGT AAATGATGCT TTTGGTACTG CGCATCGTGA GCATGCATCT 2820 

AATGTTGGTA TTTCTACACA TTTAGAAACT GCAGCTGGAT TCTTAATGGA TAAAGAAATT 2880 

AAGTTTATTG GCGGCGTAGT TAACGATCCA CATAAACCAG TTGTTGCTAT TTTAGGTGGA 2940

GCAAAAGTAT CTGACAAAAT TAATGTCATC AAAAACTTAG TTAACATAGC TGATAAAATT 3000 

ATCATCGGCG GAGGTATGGC TTATACTTTC TTAAAAGCGC AAGGTAAAGA AATTGGTATT 3060 


TCATTATTAG AAGAAGATAA AATCGACTTC GCAAAAGATT TATTAGAAAA ACATGGTGAT 3120 

AAAATTGTAT TACCAGTAGA CACTAAAGTT GCTAAAGAAT TTTCTAATGA TGCCAAAATC 3180 

ACTGTAGTAC CATCTGATTC AATTCCAGCA GACCAAGAAG GTATGGATAT TGGACCAAAC 3240 


ACTGTAAAAT TATTTGCAGA TGAATTAGAA GGTGCGCACA CTGTTGTATG GAATGGACCT 3300 

ATGGGTGTAT TCGAGTTCAG TAACTTTGCA CAAGGTACAA TTGGTGTATG TAAAGCAATT 3360 
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TCTTTAGGTT TTGAAAATGA CTTCACTCAT 

ATTTCAACTG GTGGCGGCGC GTCATTAGAG 3480

TACCTAGAAG GTAAAGAATT GCCTGGTATC AAAGCAATCA ATAATAAATA ATAAAGTGAT 3540

AGTTTAAAGT GATGTGGCAT GTTTGTTTAA CATTGTTACG GGAAAACAGT CACAAGATGA 3600

CATCGTGTTT CATCACTTTT CAAAAATATT TACAAAACAA GGAGTGTCTT TAATGAGAAC 3660

ACCAATTATA GCTGGTAACT GGAAAATGAA CAAAACAGTA CAAGAAGCAA AAGatTCGTC 3720

AATACATTAC CAACACTACC AGATTCAAAA GAAGTAGAAT CAGTAATTTG TGCACCAGCA 3780

ATTCAATTAG ATGCATTAAC TACTGCAGTT AAAGAAGGAA AAGCACAAGG TTTAGAAATC 3840

GGTGCTCAAA ATACGTATTT CGAAGATAAT GGTGCGTTCA CAGGTGAAAC GTCTCCAGTT 3900

GCATTAGCAG ATTTAGGCGT TAAATACGTT GTTATCGGTC ATTCTGAACG TCGTGAATTA 3960

TTCCACGAAA CAGATGAAGA AATTAACAAA AAAGCGCACG CTATTTTCAA ACATGGAATG 4020

ACTCCAATTA TATGTGTTGG TGAAACAGAC GAAGAGCGTG AAAGTGGTAA AGCTAACGAT 4080

GTTGTAGGTG AGCAAGTTAA GAAAGCTGTT GCAGGTTTAT CTGAAGATCA ACTTAAATCA 4140

GTTGTAATTG CTTATGAACC AATCTGGGCA ATCGGAACTG GTAAATCATC AACATCTGAA 4200

GATGCAAATG AAATGTGTGC ATTTGTACGT CAAACTATTG CTGACTTATC AAGCAAAGAA 4260

GTATCAGAAG CAACTCGTAT TCAATATGGT GGTAGTGTTA AACCTAACAA CATTAAAGAA 4320

TACATGGCAC AAACTGATAT TGATGGGGCA TTAGTAGGTG GCGCATCACT TAAAGTTGAA 4380

GATTTCGTAC AATTGTTAGA AGGTGCAAAA TAATCATGGC TAAGAAACCa ACTGCGTTAA 4440

TTATTTTAGA TGGTTTTGCG AACCGCGAAA GCGAACATGG TAATGCGGTA AAATTAGCAA 4500

ACAAGCCTAA TTTTGATCGT TATTACAACA AATATCCAAC GACTCAAATC GAAGCGAGTG 4560

GCTTAGATGT TGGACTACCT GAAGgACAAA TGGGTAACTC AGAAGTTGGT CATATGAATA 4620

TCGGTGCAGG ACGTATCGTT TATCAAAGTT TAACTCGAAT CAATAAATCA ATTGAAGACG 4680

GTGATTTCTT TGAAAATGAT GTTTTAAATA ATGCAATTGC ACACGTGAAT TCACATGATT 4740

CAGCGTTACA CATCTTTGGT TTATTGTCTG ACGGTGGTGT ACACAGTCAT TACAAACATT 4800

TATTTGCTTT GTTAGAACTT GCTAAAAAAC AAGGTGTTGA AAAAGTTTAC GTACACGCAT 4860

TTTTAGATGG CCGTGACGTA GATCAAAAAT CCGCTTTGAA ATACATCGAA GAGACTGAAG 4920

CTAAATTCAA TGAATTAGGC ATTGGTCAAT TTGCATCTGT GTCTGGTCGT TATTATGCAA 4980

TGGATCGTGA CAAACGTTGG GAACGTGAAG AAAAAGCTTA CAATGCTATT CGTAATTTTG 5040

ATGCCCCAAC TTATGCAACT GCCAAAGAAG GTGTAGAAGC AAGCTATAAT GAGGGCTTAA 5100

CTGACGAATT CGTAGTACCA TTCATCGTTG AGAATCAAAA TGACGGTGTT AATGATGGAG 5160
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CGAACAGAGC 


ATTCGAAGGC TTTAAAGTTG AACAAGTTAA AGACTTATTC TATGCAACAT 5280

TCACTAAGTA TAATGACAAT ATCGATGCGG CTATCGTCTT CGAAAAAGTT GATTTAAATA 5340

ATACAATTGG TGAAATTGCA CAAAATAACA ATTTAACTCA ATTACGTATT GCAGAAACTG 5400

AAAAATACCC TCACGTTACT TACTTTATGA GTGGTGGACG TAACGAGGAA TTTAAAGGTG 5460

AACGCCGTCG TTTAATTGAT TCACCTAAAG TTGCAACGTA TGACTTGAAA CCAGAAATGA 5520

GTGCTTATGA AGTTAAAGAT GCATTATTAG AAGAGTTAAA TAAAGGTGAC TTGGACTTAA 5580

TTATTTTAAA CTTTGCTAAC CCTGATATGG TTGGACATAG TGGTATGCTT GAGCCGACAA 5640

TCAAAGCAAT CGAAGCGGTT GATGAATGTT TAGGAGAAGT GGTTGATAAG ATTTTAGACA 5700

TGGACGGTTA TGCAATTATT ACTGCTGACC ATGGTAACTC TGATCAAGTA TTGACGGaTG 5760

ATGATCAACC AATGACTACG CAvACAACGA ACCCAGTACC AGTGATTGTA ACAAAAGAAG 5820

GCGTTACACT TAGAGAAACT GGTCGCTTAG GTGACTTAGC ACCTACATTA TTAGATTTAT 5880

TAAATGTAGA ACAACCTGAA GATATGACAG GTGAaTCTTT AATTAAACAC TAATATTGTA 5940

AAAGATGTTA AGTAAACGCT TAATGACACT TATTTTTTGA AAATAATAGT AATATCnTTT 6000

TGTTAAATGA AAGAATAAAG CTATAATAAT TATAGAATAA CTATTTAn 6048



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5602 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 
AAAGAAGTGC AAGATATCAT CGCATTAATT AAGTCGTTAC AAAgTGTAAT TGTAGACaTC 60 

GCTTCCAATA ATGTTGATAC AATTATGCCT GGTTATACTC ATTTACAGCG TGCACAGCCA 120

ATTTCATTTG CACATCATAT TATGACTTAT TTTTGGATGT TACAACGAGA CCAACAACGA 180 
TTTGAAGATA GTTTAAAACG AATCGATATT AATCCTTTAG GTGCAGCAGC CTTAAGTGGT 240 

ACCACATACC CTATCGATAG ACACGAGACA ACAGCATTGT TGAACTTTGG CAGTCTCTAT 300

GAGAATAGCC TAGATGCTGT TAGTGACAGA GACTATATTA TTGAAACATT GCATAATATT 360 
TCTTTAACGA TGGTTCACTT ATCACGCTTT GCAGAGGAAA TTATTTTCTG GTCCACAGAC 420 

GAAGCTAAAT TCATTACATT ATCAGATGCA TTTTCAACTG GCTCATCTAT TATGCCACAA 480
AAGAAAAATC CTGATATGGC AGAATTAATT AGAGGTAAAG TTGGTCGAAC GACTGGTCAT 540
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25 



GAAGATAAAG AAGGTTTATT CGATGCTGTC CATACAATTA AAGGTTCTTT ACGTATTTTC 660 

GAAGGTATGA TTCAAACGAT GACAATTAAT AAAGAACGAC TCAATCAAAC TGTTAAAGAA 720 

GATTTTTCAA ATGCAACGGA ACTAGCAGAT TATTTAGTAA CTAAAAATAT TCCATTTAGA 780 

ACTGCACATG AAATTGTAGG AAAAATCGTC TTAGAATGTA TACAACAAGG TCATTATTTA 840 

TTAGATGTTC CTTTAGCAAC ATATCAACAA CATCATTCTA GTATTGATGC CGATATTTAC 900 

GATTATTTGC AGCCTGAAAA TTGTTTAAAA CGACGTCAAA GTTACGGTTC AACAGGTCAA 960 

TCATCGGTCA AACAACAACT TGATGTTGCT AAACAATTAC TATCACAATA AATACGTTAA 1020 

TCTACCTACC CACAATGTCT ATTAAAATTA CATTGTGGGT ATTTTAATGC TCTCTTCGTC 1080

TTGTTGAACA TCACATTTTT AAGATTCCTA AAATGTTTGA TAATTCTTTT AAATTTATAT 1140 

TACAAAAATG TTATAAATTG TAAAAGAAAT GTGTAAAGCG TTTTCACAAG CAGGTTTTTG 1200 

TAGTATTTTA AAATTGTTAG ACTACAAATA AAGAGATGAA AGGATAAAGA CTATGACTAA 1260 

CTCTTCGAAA AGCTTCACTA AATTTATGGC TGCTTCTGCT GTTTTTACTA TGGGATTTTT 1320 

ATCAGTACCT ACTGCTGGCG CTGAACAAAC AAATCAAATT GCAAATAAAC CTCAGGCTAT 1380 

TCAATGGCAT ACAAATTTAA CGAATGAGCG ATTCACTACT ATCGCACATC GTGGCGCAAG 1440 

TGGCTATGCA CCCGAGCATA CGTTTCAAGC ATATGATAAG AGTCATAATG AGTTAAAAGC 1500 

ATCTTATATC GAAATTGATT TACAACGTAC CAAAGATGGC CATTTAGTTG CTATGCATGA 1560 

TGAAACTGTT AACCGTACAA CAAATGGACA CGGTAAAGTT GAGGATTATA CCCTTGATGA 1620 

ATTAAAACAG TTAGATGCAG GAAGTTGGTT TAATAAAAAA TATCCAAAAT ACGCAAGAGC 1680 

AAGTTATAAA AATGCTAAAG TACCCACTTT AGATGAAATT TTAGAACGTT ATGGCCCGAA 1740

TGCAAACTAT TATATTGAAA CAAAGTCACC TGATGTATAC CCAGGAATGG AAGAACAATT 1800
ATTAGCTTCA TTGAAAAAGC ATCACCTTTT AAATAACAAT AAATTAAAAA ATGGACATGT 1860

AATGATTCAA TCATTTTCTG ACGAAAGTTT AAAGAAAATT CATCGTCAAA ATAAGCATGT 1920

GCCATTAGTA AAATTAGTTG ATAAAGGTGA ACTACAACAA TTTAACGACC AACGCTTAAA 1980 

AGAGATACGC TCTTATGCGA TTGGATTAGG TCCTGATTAT ACAGATTTAA CTGAACAAAA 2040

TACCCATCAT TTAAAAGACT TAGGATTTAT AGTACATCCT TATACAGTGA ATGAAAAAGC 2100 

TGATATGTTA CGATTAAATA AATATGGCGT TGATGGTGTC TTTACAAATT TCGCTGATAA 2160 

ATATAAAGAA GTCATTAAGT AGTAATGTTA AACTAGAAAA CATAAATACA AAAATATAGC 2220 

TATTACTATA AAAAACAGCA GTAAGATATT TCCAAATTGA AATTATCCTA CTGCTGTCTT 2280 

TTTGGGAGTG GGACAGAAAT GATATTTTCG CAAAATTTAT TTCGTCGTCC CACCCCAACT 2340
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TTGTCTGTAG AAATTGAGGA GCTAATTTCT CTGTGTCGGG GCTCCACCCC AACTTGCACA 24 60 

CTATTGTAAG CTGACTTTCC GCCAGCCTCT GTGTTGGGGC CCCGCCAACT TGCACACTAT 2520 

TGTAAGCTGA CTTTCCACCA GCCTCTGTGT TGGGGCCCCG ACTATTTTTG AAAAGAGCGT 2580

GTTACACGGG CATTGTTTTA CAGTCAACTA CTGCTAAAAT AAAATTAACG AGCTTAGGGC 2640 

TTTGTTTTCT GTCCCAAGCT CGTTAAATCA CATATGATAA TTAATTATGC CCAACCACGA 2700 


TATCTAGCTG CTTCTGCTGT ACGTTTAATA CCTATGATAT ATGCTGCAAG TCTCATATCT 2760 

ATTTTTCGGT TTTGAGACAA TTCGTAAATC GTATCAAATG CCGCTTCTAA TTTTTCACGT 2820 

AGCTTTTCAT TAACTTCTTC TTCAGACCAA TAATAACCTT GATTATTTTG TACCCATTCG 2880

AAGTAAGAAA CCGTtACACC ACCAGCACTT GCTAATACGT CTGGAACTAA TAATATACCA 2940

CGTTCAGTTA AAATACGTGT TGCTTCTGGT GTTGTAGGTC CATTAGCAGC TTCAACAACG 3000

ATACTAGCTT TAATATCATG TGCATTGTCT TCTGTAATTT GGTTTGAAAT AGCCGCTGGT 3060

ACTAAAATGT CACAATCTAA TTCAAACAAT TCTTTATTTG AGATTGTTTC TTCAAATAAA 3120 

TTTGTTACCG TACCAAAACT ATCACGACGG TCTAATAAAT AATCTATATC TAAGCCATTT 3180 

GGATCGTGTA ATGCACCGTA AGCATCAGAG ATACCTACAA TTTTTGCACC TAAATCATAT 3240

AAGAATTTAG CTAAGAAACT TCCGGCATTA CCGAAACCTT GAATAACAAC CTTGGCACCT 3300 

TCAATTTGCA TATTACGACG TTTTGCAGCT TGTTCAATTG CAATAACTAC ACCTAGTGCA 3360 

GTTGATCTGT CGCGTCCATG AGAACCACCC AATACAATTG GTTTACCTGT GATGAAACCT 3420

GGTGAATTAA ATTTATCTAA TGCACTATAT TCATCCATCA TCCAAGCCAT AATTTGTGAG 3480

TTTGTAAATA CATCTGGTGC TGGAATATCT TTGTTCGGAC CTACGAATTG TGAAATTGCT 3540

CTTACATATC CGCGTGATAA ACGTTCAACT TCATGAATGC TCATTTGACG TGGATCACAA 3600 

ACGATACCAC CCTTACCACC ACCGTATGGT AAGTTTACAA TGCCACATTT CAAAGTCATC 3660 

CACATTGATA ATGCTTTTAC TTCTTCTTCA TCAACATCTG GGTGGAAACG CACGCCCCCT 3720

TTTGTTGGTC CAACAGCATC ATTATGTTGC GCACGGTAAC CTGTGAATGT TTTTACTGTG 3780 

CCATCATCCA TTCGTACAGG GATACGCACT TGTAACATTC TTAAAGGTTC TTTAATTAAA 3840 


TCGTACATTC CTtCGTCAAA TCCCAATTTA TGCAATGCTT CTTTAATAAT TCCTTGAGTA 3900 

GAAGTTACTA AATTATTGTT CTCAGTCATG ATCCTTTTCG CCTCTTCTTT ACCTAATGAT 3960 

TTCGCTTTCA AACATATTGT AACATAACGT ATTCCTTTTT AAAGCCCTTA CAAACTGATT 4020 

GTTACAACTT TTTGACATTA TTGAAATACA TGTCTTATTT TTTCAAGTGC AAGGTCCAAT 4080

TCTTCTTTAG TAATAATTAA TGGTGGTGCA AAACGAATGA CAGTATCATG CGTTTCTTTA 4140 
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ACACCTATAA ACAAACCACG TCCACGGACT TCTTTAATTG ATGGATGATC AATTTGCTTT 4260 

AATTGTTCTT TAAAATAATC TCCTAATTCT AAAGAGCGGC CTGGTAAATC CTCATCAACG 4320 

ATAACATCTA ATGCAGCAAT TGATGCAGCA CAAGCAAGTG GATTACCACC AAATGTTGAA 4380 

CCATGTGAGC CAGGTGTAAA GACATCTAAT ACTTCTTTAT CTGCTAATAC AACAGAAATT 4440 

GGGAAGACTC CACCACCTAG TGCTTTACCT AAAATATAGA CATCAGGTTT TACATTATCC 4500 

CAATCCGTAG CAAATAATTT ACCCGAACGA CCTAATCCTG CTTGGATTTC GTCAGCAATA 4560 

AATAAGACAT TATGTTCATC ACATAATTCT CTAATTGCTT TCAAATATCC TTCTGGCGGT 4620 

ATATTTATAC CCGCTTCACC TTGAATTGGT TCTACTAAAA CTGCTGCAGT ATTTTCATTA 4680 

ATTGCAGCTT TCAATGCATC TACATCTCCA AAATCAACTT TTCTAAATCC ATCTAATAAC 4740

GGACCATAAC CACGTTGGTA TTCTGCTTCT GAAGATAATG AAACTGGCGC CATTGTTCGA 4800

CCATGGAAGT TACCATTAAA TGCAATGATT TCTGCTTTAT TTGGCTCAAT TCCTTTAACA 4860

TCGTATGCCC AGCGTCGTGC TGCTTTCAAA GCTGTTTCTA CTGCTTCAGC ACCTGTATTC 4920

ATTGGTAAAG CTTTATCTTT ACCTGCCAGT TTACAAATTT TTTCGTACCA TTCACCTAAG 4980 

TTATCACTAT GAAAAGCACG TGAAACTAAA GTCACTTTAT CAGCTTGATC TTTTAATGCT 5040 

TGAATAATTT TCGGATGTCT ATGACCTTGG TTAACAGCGG AATATGCAGA TAACATATCC 5100 

ATATATTTAT TGCCTTCAGG ATCTTTAACC CATACCCCTT CAGCTTcTGa AATGaCAATT 5160 

GGcAATGGTA AATAATTATG TGCTCCGTAA TGATTTGTTA ACTCAATAAT TTTTTCAGAT 5220 

TTAGTCATCA TATCTCCCCT TTTCATCATT TATAACTATT ATACATGAAA CATTATCCAA 5280 

ATAATTACAT TAGTTTTCAA AGCAGATACT TTTCCACCAA AAAAGATGAA ATAATCACTA 5340 

AGTTTCATTA AATTTGTCTA TTTTGAAAAC CCTTACATTT ATAATGACAT AATTACTTAA 5400

ATGaJTACAA GCAAAAGAAT TGATAATTTT ACACTTAATC AAAAGTATAT TTTACTAAGA 5460 

ATAT*TTTTAT TTATAAATAT TGAAAACCAC TAACAAATTG CATACACAAT ATCATTAGTG 5520 

GTAACAGTTA AACACTTATT TATCTTTACG GGGTAATGGG TTAAAACCCT TnCATTAAAA 5580 

TTGGATGnCC ATAAAATTAG GG 5602 
(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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TAACCCCATT TTACCTGGAA AAATCgTTTG 
CGATGCaATm GCaTTtGaAT ATAaATACAT 60

TTTACGTATa GAATTATAAA AgGTTTCATT CaAATCTTAG GGTCAAAAAT GTTATAATAT 120

TTTTATGTCA AATTTAAAAC AGTAACACTT ATTTACAAGG TTGCAATATT TTGAAGTAAT 180

AAAGGAAGTG TCGCGTATTT TAACTTTTTC AGAGCAAAAT GCACTCGCGA AAATAGATGA 240

TTTAATGAAT ACTTATTGCA ATCAATGTCC AATCAAAACT CGTCTGCGTA AATTAGAGGG 300

GAAAACGAAG GCGCATCATT TTTGTATCAA TGAGTGTTCA ATAGGGAAAG AAATAAAACA 360

ATTAGGAAAT GAACTTCAAT AGGAGGAAGT CAAATGAAAA TTATATCTAT ATCAGAAACA 420

CCGAACCACA ACACAATGAA GATTACACTT AGTGAAAGCA GAGAAGGTAT GACATCAGAT 480

ACGTATACTA AAGTTGATGA TTCACAGCCA GCATTTATTA ATGACATCTT AAAGGTTGAA 540

GGCGTTAAAT CAATTTTCCA TGTTATGGAC TTTATTTCAG TAGATAAAGA AAATGACGCA 600

AATTGGGAAA CAGTATTGCC AAAAGTAGAG GCTGTATTCG AATAAATTTT TCATCAACTA 660

GTATTCGGGG GGAATAAAGT ATATGGAAAT TTTACGTATA GAGCCAACAC CAAGTCCAAA 720

TACAATGAAA GTTGTTTTGT CATATACAAG AGAAGACAAG TTATCTAATA CTTATAAAAA 780

AGTAGAAGAA ACACAACCAA GATTTATAAA TCAGTTGTTA TCTATAGATG GTATCACTTC 840

CATTTTTCAT GTCATGAACT TCTTAGCTGT TGATAAGGCA CCAAAAGCTG ATTGGGAAGT 900

CATATTACCT GATATTAAAG CTGCTTTTTC TGATGCGAAT AAGGTTTTAG AATCTGTAAA 960

TGAACCTCAA ATTGACAATC ATTTTGGTGA AATTAAAGCT GAATTATTAA CTTTTAAGGG 1020

TATACCGTAT CAAATTAAGC TAACTTCTGC TGACCAAGAA TTAAGAGAAC AATTACCACA 1080

AACATATGTT GACCATATGA CTCAAGCGCA AACAGCACAT GACAATATTG TTTTTATGCG 1140

TAAATGGCTA GATTTAGGAA ATCGCTATGG AAATATTCAA GAAGTAATGG ATGGTGTCCT 1200

AGAAGAAGTG CTAGCTACCT ATCCAGAATC ACAGTTACCC GTATTGGTAA AACATGCTTT 1260

AGAAGAAAAT CACGCAACTA ATAATTATCA TTTCTATCGA CATGTCTCTT TGGATGAATA 1320

TCATGCAACT GATAATTGGA AGACTCGATT AGGAATGTTA AACCATTTTC CAAAGCCGAC 1380

TTTTGAAGAT ATACCGCTGC TTGATTTAGC TTTATCTGAT GAAAAAGTAC CGGTTAGACG 1440

TCAAGCGATT GTATTATTAG GTATGATTGA AAGTAAAGAA ATTTTACCGT ATTTATATAA 1500

GGGGCTTCGT GATAAAAGTC CTGCTGTAAG AAGAACAGCA GGGGATTGCA TAAGCGATTT 1560

AGGGTATCCA GAGGCACTAC CAGAAATGGT GCTACTATTA GATGATCCAC AGAAAATCGT 1620

TAGGTGGCGT GCTGCTATGT TTATCTTTGA TGAAGGTAAT GCAGAGCAGC TTCCCGCACT 1680

AAAAGCCCAT ATTAATGACA ATGCGTTTGA AGTTAAATTA CAAATTGAAA TGGCCATATC 1740
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AATTTAATTG GAGGAATTAA ATATGAATGC ATATGATGCT TATATGAAAG AAATTGCGCA i860 

ACAAATGCGT GGCGAATTAA CTCAAAATGG TTTTACAAGT TTAGAAACGA GCGAACAGct 1920 

ATCGGAGTAT ATGAACCAAG TAAATGCTGA TGACACTACT TTTGTAGTTA TTAACTCTAC 1980 

ATGCGGCTGT GCAGCTGGAT TAGCAAGACC AGCTGCAGTA GCAGTTGCAA CACAAAATGA 2040 

ACATAGACCT ACAAATACAG TTACAGTTTT TGCTGGGCAA GATAAAGAAG CAACTGCTAC 2100 

AATGCGAGAA TTCATTCAGC AAGCACCATC TAGTCCTTCG TATGCTTTAT TCAAAGGTCA 2160 

AGATTTAGTT TATTTTATGC CTAGAGAATT TATCGAAGGT AGAGATATTA ATGACATTGC 2220 

AATGGACTTA AAGGATGCCT TTGACGAAAA TTGTAAATAG TACACATAAA TAAATATAAA 2280 

GGTTAACACA TTTTATAATA TTAAAAATGG TGTCTGTCAT TGAAAATAGA GAATATAGTT 2340

GTATTCTATT TGTTAAATAA AGTCCGTTTT TACCaACTAT ATTTTCTAGA AATTTAACTG 2400

TTTTAATAGG ACATCAAACA TAATATTCaA ATCaTGTGTT AACCTCTTTT TTAAAATTTT 2460

TTAGCATTAA AGTTATAGAT TTGGGTAAAC AATTACCAAT TGGAAACATA TATCACGTTA 2520

CGATGGGGTA GGTACTTAAT CAGCATTTTA TAAATAAAGT AACGGAATTC ATGATATTAA 2580

TATCATATTC CTAAAATGAG TGATAACAAA ATGCTACATA AAGTTAAGTT ATATCAAACT 2640

AAATATACAT ACTATAAATA ATGAAAATGA GGTGTTATCG CATATGTTGA ATTCATTTGA 2700 

TGCAGCATAT CACAGTCTTT GTGAAGAAGT TTTAGAAATA GGAAATACAC GAAATGATCG 2760 

CACAAATACA GGTACGATTT CGAAATTTGG TCATCAACTT CGCTTTGACT TATCTAAAGG 2820

ATTTCCACTA TTAACGACAA AGAAAGTTTC TTTTAAATTA GTAGCAACCG AATTATTATG 2880

GTTCATTAAA GGAGATACAA ACATCCAATA CTTATTAAAA TATAATAATA ATATATGGAA 2940

CGAATGGGCT TTTGAAAATT ATATCAAATC AGACGAGTAT AAAGGTCCAG ATATGACAGA 3000 

TTTCZ5GGCAT CGTGCATTGA GTGATCCTGA ATTTAACGAA CAATATAAAG AACAAATGAA 3060 

ACAATTTAAG CAACGTATTC TTGAAGATGA TACATTTGCG AAGCAATTCG GGGATTTAGG 3120

AAATGTTTAT GGTAAACAAT GGCGAGATTG GGTTGATAAA GATGGTAATC ATTTTGATCA 3180 

ACTTAAAACA GTAATTGAAC AAATTAAGCA TAATCCAGAT TCAAGGCGAC ACATCGTATC 3240

TGCATGGAAT CCAACAGAAA TTGATACAAT GGCACTTCCG CCTTGTCATA CCATGTTCCA 3300

GTTTTATGTC CAAGATGGTA AGTTAAGTTG CCAGTTATAC CAACGTAGCG CAGATATCTT 3360

TTTAGGTGTG CCATTTAATA TCcGCagctA CGCTTTATTG ACACACCTTA TTGCCAAAGA 3420

ATGTGGACTT GAAGTGGGTG AATTTGTGCA TACATTTGGA GATGCACATA TTTATTCAAA 3480

TCATATTGAT GCGATTCAAA CACAATTAGC ACGTGAAAGC TTCAATCCTC CAACATTAAA 3540 
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TGAATCACAT CCAGCAATAA AAGCTCCAAT AGCAGTGTAG TCATTGCATA GTTAGCTAAC 3660 

CATATAGACA TCAAAATGAC ATCATAGTAT TTTCAAGTGC AAAAAAGTAC TTTTTTGTGT 3720 

TAAACGTTTT CATAAATTAT GCAAAATCAT TATTTCTATC ACACTTTATG ATAAAAATTG 3780 

TGTTAAATTA AAGATAACTT AGTAATAAAA AATGAAATGA TAGAAGAAGG AGGATAATTA 3840 

TGACTTTATC CATTCTAGTt GCACATGACT TGCAACGAGT AATTGGTTTt GAAAATCAAT 3900 

TACCTTGGcA CCTACCAAAT GATTTGAAGC ATGTTAAAAA ATTATCAACA GGTCATACTT 3960 

TAGTAATGGG TCGTAAGACA TTTGAATCGA TTGGTAAACC ACTACCGAAT CGTCGAAATG 4020 

TTGTACTTAC TTCAGATACA AGTTTCAACG TAGAnGGCGT TGATGTAATT CACTCTATTG 4080 

AAGATATTTA CCAACTACCG GGCCATGTTT TCATATTTGG AGGGCAAACA TTATTTGAAG 4140

AAATGATTGA TAAAGTGGAC GACATGTATA TTACTGTTAT TGAAGGTAAA TTCCGTGGTG 4200 

ATACGTTCTT TCCACCTTAT raCATTkGAgr CTGGGAAGTT GCCTCTTCAG TTGAAGGTAA 4260

ACTAGATGAG AAAAATACAA TTCCACATAC CTTTCTACAT TTAATTCGTA AAAAATAAGG 4320 

GGGAAAACGA CCATGACAAA ACAGATTATA GTAACAGACT CAACATCCGA TTTATCTAAA 4380 

GAATACTTAG AAGCAAACAA CATTCATGTA ATTCCTTTAA GTTTAACTAT TGAAGGAGCT 4440 

TCATACGTTG ACCAAGTAGA TATTACATCA GAAGAATTTA TTAATCATAT TGAAAATGAT 4500

GAAGATGTAA AGACAAGTCA GCCAGCCATA GGTGAATTTA TATCTGCTTA TGAAGAACTA 4560 

GGAAAAGATG GCTCTGAAAT CATAAGTATT CATCTTTCTT CAGGATTAAG TGGTACATAT 4620 

AACACTGCTT ACCAAGCAAG TCAAATGGTA GATGCTAATG TAACTGTTAT TGATTCAAAA 4680 

TCTATTTCTT TTGGTTTAGG GTATCAAATA CAACACCTAG TAGAGCTTGT AAAAgAaGGT 4740 

GtCTCAACTT CTGAAATAGT TAAAAAGTTA AATCATTTAA GAGAAAACAT TAAATTATTT 4800

GTAGTTATAG GGCAATTGAA TCAATTAATT AAAGGTGGCA GAATTAGTAA AACAAAAGGT 4860

TTGATTGGTA ATCTTATGAA AATTAAACCA ATTGGTACAC TAGATGATGG TCGCTTAGAG 4920

CTTGTGCmCA ATGCGAGAAC TCaAAATTCk AGTATCCAAT ACTTGAAAAA GGAAATTGCT 4980

GAATTTATAG GAGATCATGA AATCAAATCC ATTGGTGTCG CACATGCTAA CGTCATTGAA 5040

TATGTTGATA AATTGAAGAA AGTTTTTAAT GAAGCTTTTC ATGTGAATAA TTACGATATA 5100

AATGTAACTA CACCAGTTAT TTCTGCACAT ACTGGTCAAG GTGCGATTGG CCTCGTAGTC 5160 

CTTAAGAAGT AAATTTAATC TTTTCAGTGT TAATTACTTC CATTTCAATC CTTTATAGAC 5220 

TAAATTTATA ATTAGATAGA TAGAGGAGGT AATTCATATG ACAAAAGAAT ATGCAACATT 5280 

AGCAGGAGGA TGTTTCTGGT GCATGGTTAA ACCATTTACA TCATATCCAG GCATCAAGTC 5340 
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GAATCAAACC GGCCATGTCG AAGCAGTACA AATTACGTTT GATCCAGAGG TTACTTCCTT 5450 

TGAAAATATA TTAGACATAT ATTTCAAAAC ATTTGACCCA ACTGATGATC AAGGGCAATT 5520 

TTTCGATAGA GGCGAAAGCT ATCAACCAGT CATTTTCTAT CATGATGAAC ATCAGAAAAA 5580 

GGCTGCTGAG TTTAAAAAGC AACAATTAAA TGAACAAGGT ATTTTCAAGA AACCAGTGAT 5640 

TACACCTATT AAACCATATA AAAATTTCTA TCCAGCTGAA GACTACCATC AAGATTATTA 5700 

CAAAAAGAAC CCGGTACATT ATTACCAATA TCAACGTGGT TCAGGTAGAA AAGCGTTTAT 5760 

AGAATCACAT TGGGGGAATC AAAATGCTTA AAAAAGATAA AAGTGAACTA ACAGATATAG 5820

AATATATTGT TACACAAGAn AACGGCACTG AACCACCATT TATGAATGAA TATTGGAATC 5880

ATTTTGCTAA AGGATTTATG TAGATAAAnT TCnGGTAAAC CTTG 5924 
(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9280 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

GGCCGTTnAA AATCTCCAAA ATAnAAAAAC CCATCTTGTT CCAATGTTTT AAAATCGCCa 60 

TCCaACACTT GaTCaATAGC TTGCAACAAC GTTGAACGTG TTTTaCCAAA AGCATCaAAC 120 

GCTCCCACTA AAATCAGTGC TTCAAGTAAC TTTCTCGTTT TGACTCTCTT CGGTATACGT 180 

CTAGCAAAAT CAAAGAAATC TTTAAATTTG CCGTTCTGAT AACGTTCATC AACAATCACT 240

TTCACACTTT GATAACCAAC ACCTTTAATT GTACCAATTG ATAAATAAAT GCCTTCTTGG 300 

GAAGGTTTAT AAAACCAATG ACTTTCGTTA ATGTTCGGTG GCAATATAGT GATACCTTGT 360 

TTTTTTGCTT CTTCTATCAT TTGAGCAGTT TTCTTCTCAC TTCCAATAAC ATTACTTAAA 420

ATATTTGCGT AAAAATAATT TGGATAATGG ACTTTTAAAA AGCTCATAAT GTATGCAATT 480

TTAGAATAGC TGACAGCATG TGCTCTAGGA AAACCATAAT CAGCAAATTT CAGAATCAAA 540

TCAAATATTT GCTTACTAAT GTCTTCGTGA TAACCATTTT GCTTTGCACC TTCTATAAAA 600

TGTTGACGCT CACTTTCAAG AACAGCTCTA TTTTTTTTAC TCATTGCTCT TCTTAAAATA 660

TCCGCTTCAC CATAACTGAA GTTTGCAAAT GTGCTCGCTA TTTGCATAAT TTGCTCTTGA 720

TAAATAATAA CACCGTAAGT ATTTTTTAAT ATAGGTTCTA AATGCGGATG TAAATATTGA 780 

ACTTTGCTTG GATCATGTCT TCTTGTAATG TAAGTTGGAA TTTCTTCCAT TGGACCTGGT 840 
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ACACTTCTTA 


CACCGTCAGA 


Vl^lAAl 


Ti TV *P inVOf 7V f+ 

AA i ATUC CAG 


TCGTATCTCC 


TTGCGACAAC 


960 




AATTCAAACA CTTTTTGATC A a LAAAtJVnjA ATCTTTTCGA TATCAATATT AATACCTAAA 1020




tcltitttga 


CTTGTGTTAA 


C & T^l "TV*^ TV »T/'"» A 

\jJ\l 1 IVjAiUA 


A XAATCGATA 


AGTTTCTCAA 


C CCT AGAAAA 


1080 




TCTATTTTTA ATAACCCAAT ALVj I TCCjGCT TCAGTCATTG TCCATTGCGT TAATAATCCT 1140




GTATCCCCTT TCGTTAAAGG GGCATATTCA TATAATGGAT GGTCATTAAT AATAATTCCT 1200

GCCGCATGTG TAGATGTATG TCTTGGTAAA CCTTCTAACT TTTTACAAAT ACTGAACCAG 1260

CGTTCATGTC GATGGTTTCG ATGTACAAAC TCTTTAAAAT CGTCAATTTG ATATGCTTCA 1320

TCAAGTGTAA TTCCTAATTT ATGTGGGATT AAACTTGAAA TTTCATTTAA TGTAACTTCA 1380

TCAAACCCCA TAATTCTTCC AACATCTCTA GCAACTGCTC TTGCAAGCAG ATGACCGAAA 1440

GTCACAATTC CAGATACATG TAGCTCGCCA TATTTTTCTT GGACGTACTG AATGACCCTT 1500

TCTCGGCGTG TATCTTCAAA GTCAATATCA ATATCAGGCA TTGTTACACG TTCTGGGTTT 1560

AAAAAACGTT CAAATAATAG ATTGAATTTA ATAGGATCAA TCGTTGTAAT TCCCAATAAA 1620

TAACTGACCA GTGAGCCAGC TGAAGAACCA CGACCAGGAC CTACCATCAC ATCATTCGTT 1680

TTCGCATAAT GGATTAAATC ACTTACTATT AAGAAATAAT CTTCAAAACC CATATTAGTA 1740

ATAACTTTAT ACTCATATTT CAATCGCTCT AAATAGACGT CATAATTAAG TTCTAATTTT 1800

TTCAATTGTG TAACTAAGAC ACGCCACAAA TATTTTTTAG CTGATTCATC ATTAGGTGTC 1860


TCATATTGAG GAAGTAGAGA TTGATGATAT TTTAATTCTG CATCACACTT TTGAGCTATA 1920

ACATCAACCT GCGTTAAATA TTCTTGGTTA ATATCTAATT GATTAATTTC CTTTTCAGTT 1980

AAAAAATGTG CACCAAAATC TTCTTGATCA TGAATTAAGT CTAATTTTGT ATTGTCTCTA 2040

ATAGCTGCTA ATGCAGAAAT CGTATCGGCA TCTTGACGTG TTTGGTAACA AACATtTTGA 2100

ATCC3AACAT GTTTTCTACC TTGAATCGAA ATACTAAGGT GGTCCATATA TGTGTCATTA 2160

TGGGTTTCAA ACACTTGTAC AATATCACGA TGTTGATCAC CGAC1T1T1T AAAAATGATA 2220

ATCATATTGT TAGAAAATCG TTTTAATAAT TCAAACGACA CATGTTCTAA TGCATTCATT 2280

TTTATTTCCG ATGATAGTTG ATACAAATCT TTTAATCCAT CATTATTTTT AGCTAGAACA 2340

ACTGTTTCGA CTGTATTTAA TCCATTTGTC ACATATATTG TCATACCAAA AATCGGTTTA 2400

ATGTTATTTG CTATACATGC ATCATAAAAT TTAGGAAAAC CATACAATAC ATTGGTGTCA 2460

GTTATGGCAA GTGCATCAAC ATTTTCAGAC ACAGCAAGTC TTACgGCATC TTCTATTTTT 2520

AAGCTTGAAT TTAACAAATC ATAAGCCGTA TGAATATTTA AATATGCCAC CATGATTGAA 2580

TGGCCCCTTT CTATTAGTTA AGTTTTGTGC GTAAAGCTGT AGCAAGTTGC TCAAATTCAT 2640
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CAATATCATT AATAATCAAT TGCCCTTTAG AACGTAATCG ACATCTGATT TCATTACCTT 2760 

CATCGACTGC AAATACCCAT ATTTTCAAGC CTTTGATGTC AGCAATTGTA TTAACAAACT 2820 

GAGATGCTTC ATTTGGCTGA ATACCGAATT GCTCCAATAC ATCTTCAGTT ATTTTAACTT 2880

GGCAGAATCC ATCATCCATA AGTTCGAAAT GTTGTAAAAC ATAACCTTGA AACGGCAACA 2940 

TTTTTGGGTC CTTCTCCATC ATTTTATTTA AAAGCGCATT ATGATCAATA TCATGCCCAA 3000 

TTAACTTTCC AGCAATTTCC ATAGTATGTT CTGAGGTATT GTTAAAAAGG AATCGCCCAG 3060 

TATCACCGAC GATACCAAGA TATAAAACGC TCGCGATATC TTTATTAACA ATTGCTTCAT 3120 

CATTAAAATG TGAGATTAAA TCGTAAATGA TTTCACTTGT AGATGACGCG TTCGTATTAA 3180 

CTAAATTAAT ATCACCATAC TGATCAACTG CAGGATGATG ATCTATTTTA ATAAGTTTAC 3240 

GACCTGTACT ATAACGTTCA TCGTCAATTC GTGGAGCATT GGCAGTATCA CATACAATTA 3300

CAAGCGCATC TTGATATGTT TTATCATCAA TGTTATCTAA CTCTCCAATA AAACTTAATG 3360

ATGATTCCGC TTCACCCACT GCAAATACTT GCTTTTGCGG AAATTTCTGC TGAATATAGT 3420 

ATTTTAAACC AAGTTGTGAA CCATATGCAT CAGGATCTGG TCTAACATGT CTGTGTATAA 3480 

TAATTGTATC GTTGTCTTCG ATACATTTCA TAATTTCATT CAAAGTACTA ATCATTTTCA 3540

TACTCCCTTT TTTAGAAAAG TTGCTTAATT TAAGCATTAG TCTATATCAA AATATCTAAA 3600 

TTATAAAAAT TGTTACTACC ATATTAAACT ATTTGCCCGT TTTAATTATT TAGATATATA 3660 

TATTTTCATA CTATTTAGTT CAGGGGCCCC AACACAGAGA AATTGGACCC CTAATTTCTA 3720 

CAAACAATGC aAGTTGGGGT GGGGCCCCAA CGTTTGTGCG AAATCTATCT TATGCCTATT 3780 

TTCTCTGCTA AGTTCCTATA CTTCGTCAAA CATTTGGCAT ATCACGAGAG CGCTCGCTAC 3840

TTTGTCGTTT TGACTATGCA TGTTCACTTC TATTTTGGCG AAGTTTCTTC CGACGTCTAG 3900 

TATGCCAAAG CGCACTGTTA TATGTGATTC AATAGGTACT GTTTTAATAT ACACGATATT 3960 

TAAGTTCTCT ATCATGACAT TACCTTTTTT AAATTTACGC ATTTCATATT GTATTGTTTC 4020

TTCTATAATA CTTACAAATG CCGCTTTACT TACTGTTCCG TAATGATTGA TTAAAAGTGG 4080 

TGAAACTTCT ACTGTAATTC CATCTTGATT CATTGTTATA TATTTGGCGA TTTGATCGTT 4140 

AATTGTTTCA CCCATCTGAG GCTGTCTTCC TAAAAGTTGC ATAGACTTTA AAACATCTTG 4200

TCTATTAATC ACACCCACTG TCTTTTTATT ACTCGAAACG ACAGGAATCA ATTCAATACC 4260

TTCCCAAATC ATCATATGCG CACAACTTGC TACTGTACTC ATAGCATTTA CATAAATAGG 4320

ATTTCGCGTC ATCACTTTAT CTATTTCGTC GTCGTCCTTT GTATTAATCA TCTCTCGACT 4380

TGTTACAATA CCTACTAATT TATACGACTC ATTGACTACC GGAAATCTTG TATGGCCAGT 4440




ATCTAATGGC GTCATTATAT CTTGAACTAT TAAGATATCT TTTCGTATTT TCTGATTAAA 4560 

AAGTGCTTTG TTGATAATAT TTGCAACTAG GAATGTATCA TAACTTGATG ATAGAACAGG 4620 

TAAATCATGT TCATTCGCAA AATTAATAAC TTTATTAGAT GGCTTAAATC CACCAGTAAT 4680 

TAATATAGCC GTACCTCTTT TTAAAGCTTC AATCTGCACA TCTTCACGAT TTCCGACAAT 4740

CAATAATGTC TTTGGACCAA TATACTTTAA AATATCTTTG AGTTCCATTG CTCCAATTGC 4800 

AAATTTAGAT ACCATCTTAG TGATACCTTT GTTGCCACCT AACACTTGGC CATCAATAAT 4860

ATTGACAATT TCATTAAAAG TTAAATGTTC AATTTCATTA CGATTACGTT TTTCGATTCG 4920

AACCGTACCA ACACGATCTA TCGTTGCGAC CATGCCCATT TTATCAGCAT CTTTmATTGc 4980

ACGATATGCT GTCCCytCaG ATACGTTTAA AAATTTAGCG ATTTTACGCA CCGAAATTTT 5040 

AGAGCCTATA GATAACGATT CAATATAATC TAAAATTTGT TCATGTTTTG TCATTCTTTA 5100 

CCTCTTCTTT TCGAACAGTA TTAACTACAT TATAACTTTA TTTTGGATAA AAAGCATTGA 5160 

AGTGAAATGA AATAATGATC GTTtCACCTA TTTTATTTTT TGAAAATATA CAACAAACAC 5220 

AAAGATCACA AAATCTTTAA TTTTAAATGG AAAAATCCAT TATTATTTAT TAGAATGTAA 5280 

GTGAGGAGGG ATGTACTAAT GTATAAAAAT ATATTACTTG GTGTAGACAC TCAGTTAAAA 5340

AATGAAAAAG CACTAAAAGA AGTGTCTAAA TTAGCTGGCG AAGGTACAGT CGTAACAGTT 5400 

TTAAACGCAA TCAGCGAACA AGaTGCTCAA GCATCAATTA AAGCAGGTGT TCATTTAAAC 5460 

AAACTTACTG AAGAACGAAG CAAGCGATTG GAAAAAACAC GCAAAGCTTT AGAAGATTAT 5520

GGTATTGATT ATGACCAAAT AATTGTTCGT GGTAATGCAA AAGAAGAACT ATTAAAACAT 5580

GCTAATAGCG GTAAATATGA AATTGTTGTT TTAAGTAACC GTAAAGCAGA AGACAAAAAG 5640

AAATTTGTAC TTGGAAGTGT CAGCCACAAA GTAGCAAAAC GTGCGACTAT CCCTGTATTA 5700 

ATCQJTAAAT AAAATTTTTA TCCAGAATCA CAAATAATCT TTCAATCATG ATGCAGTCTC 5760 

AAACGACTGA GTAAATACAA GAAACGATTA TGACTGTGGT TCTGGATTTT TTATATCGTA 5820 

GTAAATTTAT AATCAATGTC TAATTGTATA AAACTAAAAT TACGAGAGTA GGTCAGAAAT 5880 

GATAAAGAAC CACTGATGTC CCCCGTCCAC GTCGTAACTG AATCAGTAGA ATATAAAAAC 5940

ACCCACTAAA AATATGCAGA CGATAACTTC CACATAGATT AGCGAGGTGT TTTTTAGTGT 6000 

AAAATCTATA TTCTATTTAA AACTGAACAG ATTCACCTGG TTTTAAAATT TGCACGTCCC 6060 

CTACATTAAC AGCATCTTTA AATTGTTGTG GATCTTGTTC GATTAATGGG AATGTATCAT 6120 

AATGAATCGG TACAGAAATT TTTGGTTTAA TAAATTCATT AATAGCATAA CTTGCATCAT 6180 

CAATACCCAT CGTAAAATTA TCTCCAATTG GTACAAAACA TACATCAACT GGATGACGTT 6240
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TTCAACTTCA AACACGATAC CCATTGGCAT ACCTAAATAA ACTGGgAATA CCATTTTCAT 


6360 




GTGTAAAACT TGAACTATGA AATGCTTGAA CAAATTTAAC GCTTCCGAAA TCAAaGTTTG 6420

CTTTACCACC AaTATTCATA CCATGAACAT TTTCAACACC GTGATATGAA GAAAGATAGT 6480




CAGCCATTTC TGCACTTCCA ATTACTGTTG 


CTCCTG'ITIT 


CTTTGCTAGT 




6540 


10 


CACCAAAATG ATCAAAATGA CCGTGCGTTA AAACGATATA GTCTACCTGC 


AC ly 1 1 1 CAA 


6600 


TATTCAAATC ACACTTAGGG TTATTTGAAA TAAACGGATC TACGATAACC TTTTTGTTGT 6660

TCCCTTCTAA ATAAATCGTT GATTGACCAT GAAATGATAA CTTCATTTGA GCATCCTCCT 6720

ATCAATTACT ATATAAATTT AGTACCCTTT TGCCACTTAA TTATAACAAA TTCTCAAATT 6780

TTAAAAATTG AAAATCTAGT TAATGTATTA GCTCGATTTT GAAATCTAAT AATAATTGGC 6840

ATAAAATGGA AGTAATATTA TGTTGAGGAG TGTTTATAAA ATGACAAAAA TATCAAAAAT 6900

AATAGACGAA TTGAACAATC AACAAGCTGA TGCAGCATGG ATTACAACAC CGTTGAATGT 6960

ATATTATTTT ACTGGATACC GTAGCGAACC CCATGAAAGA TTATTTGCAT TATTGATTAA 7020

GAAAGATGGT AAACAAGTAC TATTTTGTCC AAAAATGGAA GTCGAAGAAG TCAAAGCATC 7080

ACCTTTCACA GGTGAAATCG TTGGATATTT AGACACTGAA AACCCTTTTT CACTTTATCC 7140

TCAAACAATC AATAAATTAC TAATTGAAAG CGAGCACTTA ACAGTAGCAC GCCAAAAACA 7200

ATTAATCTCT GGTTTCAATG TCAATTCATT CGGAGATGTT GATTTAACAA TCAAACAATT 7260


GAGAAATATT AAATCCGAAG ATGAAATTAG CAAAATACGT AAAGCTGCTG AGTTAGCAGA 7320

TAAGTGTATC GAAATAGGTG TTTCTTATTT AAAAGAAGGT GTGACTGAAT GTGAAGTAGT 7380

CAACCATATT GAGCAAACTA TCAAACAATA TGGCGTCAAT GAAATGAGTT TTGATACGAT 7440

GGTTTTATTT GGAGATCATG CCGCATCACC TCATGGCACA CCAGGAGATC GCAGATTAAA 7500

AAGCAATGAA TATGTACTAT TTGATTTAGG TGTAATTTAT GAGCATTATT GTAGCGATAT 7560

GACACGTACT ATTAAATTTG GTGAACCTAG CAAAGAAGCA CAAGAAATTT ATAATATTGT 7620

ATTAGAAGCA GAAACATCTG CAATCCAAGC AATTAAACCT GGAATACCAT TAAAAGATAT 7680

CGATCATATC GCTAGAAATA TTATTTCAGA AAAAGGTTAT GGTGAATATT TCCCTCATCG 7740

CTTAGGTCAT GGCCTAGGAT TACAAGAACA TGAATATCAA GATGTTTCAA GTACTAATTC 7800

TAATTTGTTA GAAGCTGGCA TGGTTATTAC AATCGAACCA GGTATTTATG TACCTGGTGT 7860

TGCAGGTGTA AGAATTGAAG ATGACATACT TGTCACTAAT GAAGGATATG AAGTATTAAC 7920

ACATTACGAA AAATAAGGAG TGGGATAAAA ATGAAAAGCT TGTTACAAGC GCATTCTCAT 7980

TCAGTCAAAC ACTGCCAATA TAACATTGTA GCGCCTAAGA CATAAATTTT TATCCAAGTC 8040
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TGTAATGAAT CAAATCAATA TCATTCATGT TCGATGATTT CTTCGCATTG TTTCTAGCTT 8160 

TAATTTATCA TTATTTAATT TTAATAACCA AGGAGATGAT AACGTCATTC TTTAGTACGC 8220 

TGTAATCCAT TCCCTTTTCA TCAAATTCAA ATTATAATTG TAATGCTTCT TCTACAGATT 8280

TATATTCCAT TTCAAATGCC TCTGCAACGC CTTTATTGGT TACGTGACCT TTGTAAGTAT 8340 

TTAAACCTAA TGATAATGGT TGATTTGATT TAAATGCTTC TCTATACCCT TTATTAGCTA 8400 

GCATGAGCGC ATAAGGTAGC GTAgCATTAT TTAAAGCTAA CGTCGAAGTA CGCGGTACTG 8460

CACCTGGCAT ATTTGCAACT GCATAATGAA CCACACCATG CTTAATATAT GTAGGATCAT 8520

CATGTGTCGT AATTTTATCA GTTGtTTCAA AAATACCGCC TTGATCAATA GCAATGTCAA 8580 

TAATAACTGA CCCATTTTTC ATTTGTTTAA TCATGTCTTC TGTTACAAGT CTTGGCGCTT 8640 

TAGCACCTGG AATTAAAACT GCACCTATTA CTAAATCACT TTGTTTAACA TACAACTCAA 8700 

TATTCAACGG ATTTGACATA ATTGTATGTA CACGTCCACC GAATAAATCA TCTAATTGTT 8760 

GTAAACGCTT TGGATTAACA TCTAAAATCG TAACATCTGC ACCTAGTCCT AGTGCAATTT 8820 

TAGCTGCAlT TGTTCCTGCT TGACCACCAC CGATAATAGT TACTTTACCC TTAGGTACTC 8880 

CTGGGACACC ACCTAGTAGA ATTCCCATAC CACCATTAAG TTTTTGTAGG AACTCTGCGC 8940

CAACTTGAGC TGACATTCTT CCTGCTACCT CACTCATTGG TGATAACAAT GGTAAAGATC 9000 

GGTCTGGTAA CTGCACAGTC TCATATGCAA TACTAATTAC TTTTCTATCT ATCAAAGCTT 9060 

GTGTTAATTT TTCTTCATTT GCTAAATGAa gatAaGTGAA TAATACAAGC CCTTCTTTAA 9120

AATATGGATA TTCAGATTCA AGTGGTTCTT TAACTTTAAT AACCATATCC ACATCCCAAA 9180 

CTTTTGCTTG TTCAGCAACA ATCTCAGCAC CTGCTTCTTT GTAATCTACA TCTTCAAAGA 9240 

ATGATCCTGA ACCCGcATTT GTTTCCACTA AAACAGTATG 9280

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4669 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

CTGATTAATC TCTTGTTGTC GTGTATTTAC TAATTGAATC GTTGGTGTCT GAACACGTCC 60 

CAGGGATAGC TGTGCATCAT ACTTTGTTGT TAGTGCACGC GTTGCATTAA TCCCAACAAT 120

CCAATCTGCC TCACTTCTCG CTAACGCTGC ATAATACAAA TCGTTATATT GACGACCGTC 180 
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ACGGATTGGC TTTTTGTTAC CAACTTTATC CAAAATCAAT CTTGCAACTA 


GTTCACCTTC 


300 




TCGTCCaGCA TCTGTTGCAA TAATAATATC TTTCACTTTA TTATCTAAAA TTAACGCTTT 360

TACTGTTTTA AATTGTTTGC TTGTTTTACC AATAACAACA GTTTTCATAT ATTTAGGTAT 420

AATTGGAAGG TCTTCTAATC GCCATTCCTT TAAATTTTTA TCGTATTGTT CAGGTGTCGC 480

ATTTGTCACT AGATGACCTA ACGCCCACGT GACAATATAT TGGTTATTTT CAAAGTAACC 540

ATTACGCTTC TGATTTATTT GTAAAGCATC AGCAATATCT CTTGCGACTG ATGGTTTTTC 600

AGCTAATATT AAAGATTTCA TAAATTATCC TTTCTCATAC GTTCTTTTAT TTCGAACGTG 660

CTTCATCTAT TCCACTAATC TTTGATTTAA ATTCAATGAT TGCAAATGAT GTGTTAAATG 720

TATTGTAACA TGTTAATATC ACTATTAACT TTCATTTCAG TTGAAATACT ATATAATAAA 780

AGTAACAAAA AGTACGGAGG TAATGACATG AGCATAGTTC AGTTATATGA TATTACACAA 840

ATAAAATCGT TCATTGAACA TTCGAATTAT GAATCAGCAT CATACTTATA TAAACTTCCT 900

CAACAGTACA ATGAAATAGA TGTATTAATA ACCGATGCGA TTGAATCACC TGGTGTATTT 960

TCGATTAAAG AAAACGATTC AATCAAAGCA ATCATATTGT CTTTTGCATA CGATAAAAAT 1020

AAATTCAAAG TCATAGGCCC TTTCGTGGCT GACAATTATG TATTATCTGT CGATACGTTT 1080

GAAACGCTAT TTAAAGCAAT GACTTCGAAC CAACCTGACG ATGCCGTCTT TAACTTTTCT 1140




TTTGAAGAAG GCATTCAACA ATACAAACCA TTAATGAAAG TTATTCAAGC AAGTTATAAC 1200

TTCACTGACT ATTACATAGA AGCCCGTACA AGATTAGAAG AAGATATGCA CCAACCAAAT 1260

ATCATTCCTT ATCACAAAGG GTTTTATCGT GCTTTCAGCA AATTACACAC AACTACATTT 1320

AAATATCAGG CACAGTCACC ACAAGATATC ATTGATAGTT TAGACGACCA TCATCATTTG 1380

TTTTTATTTG TTAGCGAAGG TTTACTTAAA GGTTATTTAT ACCTTGAAAT TGATTCACAA 1440

CAGTCAATCG CCGAGATTAA ATACTTCAGT TCTCATGTAG ATTACCGTTT GAAAGGTATC 1500

GCTTTCGAGT TGCTTGCGTA TGCATTGCAA TATGCTTTTG ATAATTTTGA TATTAGAAAA 1560

GTTTATTTTA AAATTCGTAA TAAAAATAAT AAACTCATCG AACGATTTAA TGGTCTAGGT 1620

TTCCATATCA ACTATGAGTA CATTAAATTC AAATTCGAAT CACGTAACGT AAAAGATCAA 1680

ACAATCCCTG AATAAAACAC CAAGCAAATA CCCTACAGTA CATCATTAGC ATGTATTGTG 1740

GGTTTTTCTA CTTTTTGTAA ATATTGAAAA TTATAAGTAG TTGTTTTTTA CTATTAGGGC 1800

AGAATGCTTT ACAATAACAT GCAAGTGTCA ATTAAGGGGA GCACTTGCAT AAATAGTATA 1860

GGAGAGTGAG TAGTCTTGCA ATTTCTTGAT TTCTTAATCG CACTTTTACC TGCTTTATTC 1920

TGGGGAAGTG TCGTTCTTAT TAATGTGTTC GTCGGCGGTG GACCTTACAA CCAAATTCGT 1980
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TTCAATAATC 


CTACTGTAAT 


TATTGTCGGT 


CTTATTTCTG 


GTGCATTATG 


GGCGTTTGGA 


2100 




CAAGCGAATC AGCTTAAATC TATTAGTTTA ATCGGTGTAT CAAATACTAT GCCAGTTTCT 2160

ACAGGTATGC AATTAGTTGG TACAACATTA TTCAGCGTTA TCTTTTTAGG TGAATGGTCT 2220




TCAATGACTC AAATTATCTT 


X wVJ X X i. i 


GCCATGATAT 


TATTAGTTAC 


TGGTGTAGCA 


2260 


10 






A AATG AAPGT 


CAATCAGATA ATCCTGAATT 


TAAAAAAGCA 


2340 


ATGGGTATTT 


1 AAI lulnlL. 




T IV WTfi/VtT 

lAivjliUjul 1 


i L-o 1 1 V? 1 nl*> 1 


TGGTGACATC 


2400 




TTTGGTGTTG 


GTL5GAACTGA 




TTCCAATCTG 


1 VJVj\j 1 AJ. Wat- 


AATTGGTGGC 


2460 


15 


TTTATCCTAT 


CCATGAATCA 




CTTAAATCAA 


LjVjVJAV- 1 1 nA 


TCTATTGCCA 


2520 




GGTGTGATTT 


GGGGAATTGG 


i AAt, 1 Ivj ill 


ATGTTCTATT 


C 1 CAAvJ L*AAA 


AGTTGGTGTA 


2580 




GCTACAAGTT 


TCTCATTATC 




GTTATCGTTT 


CAAt,V-I lAw 


CGGTATTTTC 


2640 


20 


ATTTTAGGAG 


AAAGAAAAGA 


TCCjTHjI. CAb 


ATGACGGG IA 


1 1 1\9<j<jI-Av>Vj 


TATTATTATT 


2700 




ATCGTGATAG 


CTGCTATAAT 


Ttn AIjCj 1 AAT 


TTGAAATAGA 


TV A/^TT A A 1T1 

AAu 1 1 AAA 1 A 


CTCATGTAAC 


2760 




GTAAAAATGT 


AATCACTTCT 


GAAAAI AACL 


ATTCACTTAT 


AvjAAlviAi 1A 


AAATTAATTT 


2820 


25 


TCGGGAATTT 


TACGTTGAAT 




ATGTCCTAGG 


AAATACGTGG 


CTCTAAAAAC 


2880 




AAAACGCAAT AACACATCAT GACATTAATC ATGCGTTTTA AGACTTTAAA ATTAGCGATA 2940

CTTTTAAAAT CTTGATGATA TTCATATATC AAGTATGCGC CATACATATG AAGTGGATAG 3000

CTGCATAACG CACTGCATTA TCAACTTGAA TGTATGAGTT GAACAACTAT GTCATAAATA 3060

AAAGCCCCCT TTTCACAATA TACATTTACA TATTGTGGTA AAGGGGGCTC TCATTTTCTA 3120

CGAATACTAA AATGGATTTT ATTTTCAAAT GTGTAAACTA GACAAACACT GCCTGATACA 3180

CGTACAAAAT AATGATACTA ATAATGATTG TCAAATTGGT CGTCATACCT ATAAATGGCA 3240

GTGTTCGATA TTTAAACTGA ATACCATAAG AAATAATTGC AACACcTACC GGGAACATCC 3300

AAGTGACCAA CAATGTCGTC TTAATCATAT CATCTGATAC TGGTAACAAG ACATATACTA 3360

ACAATCCCGC AACTAATGCT AATCCATAAT GCAAACATAA ATATTTAATA GTAGCAGGTA 3420

TATACTTTCT TTCCAGAGTA AAATTCAACA TGACACCTAG CAAAATCATT GATAACGGCA 3480

TATTTGCATG GGAAAGTATG CTAAAGAAAT CGATTGCCAC ATGTGGTAAA TGGATGTGAC 3540

TTATATTCAA TATAAACATT ACAATGTATG TAACGAGTGG CACTGATTGT AATAATTTCT 3600

TACCTAAATA TTTAAAATCG AATTGATCAC TACCTTCACT AAAGTAGCTA CCTACAAAGT 3660

AAGTAATTCC AAACATCACA AAGGCACCAC CTATATCAGC CATAACAAAA TAAATAAGTC 3720

CCGTTTTAGG CCATATCACT TCAATTAGTG GATATGCAAA CAATCCAATA TTCATAGCAC 3780



55 
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CAATCATTTT 


CGCCACAATA CCATATATAA 


TCATTAAAAT 


TGGTAAAATG 


GAGAATGACA 


3900 




ATTTTAATTC TGCACTGTTT AAATTCACAA TAACTAAAGA TGGGAGTGTG ACATTAAGAA 3960

CTAATGTAGC AATGACTTGA CTATCTGTTG CTTTTATAAA ATTAATGCGC TTCAAAAAGT 4020

AACCAAGCGC AATTAATAAA ATAATCATAG TAAATTGTTC TGTCACTGTT ATCCCTTCTT 4080

TCAATAATCT TCATAATTTA TAACTTTAAC ATACTCCACA GATATTTTAG AAGTCTACTG 4140

TTTCATGCTA TAATCTACAT TAAATGCACT TAATTATATT TCAAAGGAGT GTTATAGTAT 4200

GTCTTTAGAA AACCAACTAG CCGAACTTAA ATATGATTAT GTTCGTCTTC AAGGTGACAT 4260

AGAAAAACGG GAATCTTTGA ATTTAGATAC TTCCGCACTT GTTCGTCAAC TTAAAGATAT 4320

TGAAAATGAA ATTAGAAACG TTCGTGCTCA AATGCAAGAT TAATAATCTA TCATTCAAGC 4380

AATAAATGCT TTTTGTTACA TAAATTTGAC TAGCATTGCT CTGAATACGT TATATTGATG 4440

AATTGCTTCA TTTTTCGCTC AATTACATCT AGAATCACAA GATGTTGTCG TGTTATGATT 4500

TAGTGTTTCA TTAACAACAT ACACGCATAT CTATCCCAAC ACTGCTATTT ATGTTTTCTA 4560

CGCTGnTGTA CTACATGAAC CCTTTGAAAC GGAGAGGAAG TTATCATATG CAATTTTAnC 4620

TGATTTTACT AGCAATACTT TAACnAATTG nTAGTTTAAT AGAATTTTA 4669



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2785 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 






TTTGCACCCA TCTGaTACAA TGCACCATGC GGTTTAACAT GATTAATTTT AACTTGATGA 60

ATGCGACAAA ACCCTTGTAA TGCACCTAAT TGATAAATCA TCAAATTATA AATCTCGTCG 120

TTAGAGATAT CTATATTTCG TCTGCCAAAG CCTTTCAAAT CAGGTAAACC AGGATGTGCA 180

CCTACTGCAA CATTATGTGC TTTGGCAAGT TTTACCGTTT CATTCATTAC ATTTTCATCA 240

CCAGCGTGAA AACCACAAGC AACATTCGCA CTTGTAATTA ACGGAATAAT TTGATGATCA 300

CCACCAAAGG AATAATTTCC AAATGCTTCG CCTAAATCAC AATTCAAATC AACTCGCATT 360

ATAATTCCAC CCCTTTAACA ATTTGATGTT TTTCTAAAAA TTTAATATCA ACATCTTTTG 420

CATCTCCATC ACGATATAGT GGATAATTTA AAACTGCATA TAAAAAATCG GCAGTTGTAG 480

AAAATCCATC TATCACCATT TCATCTAAGG TGACTTTCAA CTTATCAATT GCTGAAGCTC 540
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AACCGTGATA 


TAGTAAAGAA 


TCGACTCGCA 


CATTAAAGCC 


TTGAGGTAAA 


TGTAACGCTG 


660 




TCACTTTACC TGGTGTTGGT TGAAATTTCT TTTCaGGATT TTCGGCATTT ATTCTCGCTT 720

CTATCACATG ACCATTAAAT TGAATATCGC TTTGTGAAAA AGGTAAATGA TTATGTTCCA 780

ATAAATACAG TTGTGCTGCA ACCAAATCAC GTTCTGCTCG CATCTCTGTA ACAGTATGTT 840

CAACTTGTAT TCGAGCATTC ATTTCAATAA AGTAATGTGC GGTATCAGTT ACTAAAAATT 900

CAATCGTACC TGCACTTCTA TAATTTGCTG CACGTGCAAC TTTAACAGCA TCGTTACATA 960

TTTGTTGTCG TCTTTCTTCA GTTAATGCTG CACAAGGAGA TTCTTCGATT AATTTTTGAT 1020

TTTTACGTTG TACAGAACAA TCACGTTCCC CTAAATGTAC ATAATTATCC TGCCCATCTC 1080

CCaTAACTTG AACTTCAACA TGTTTTGcAA CAGGTATAAA AGCCTCAACA TAAACACGAT 1140

CATCATCAAA GTATriTTTT CCTTCACTTT TAGCTTCTTT AAATGCCTTT TCTAAATCTT 1200

CAGCTTTCTT TACAATACGT ATACCTTTAC CACCACCGCC ACTGGCAGCT TTGATAACAA 1260

CTGGATAACC GATGTCTTTG GCAAGATTCT CAATTTCAGA CACATGATTC ACAGCACCAT 1320

TTGATCCTGG AATCACAGGA ACACCTGCAT GATGAACTGT TTGTCTTGCT GTTATTTTAT 1380

CCCCCATCAT TTCCATCGTT TTTTTAGTAG GCCCTATAAA CGCTATGCCT TGTTCCTCAA 1440




CGGTTTGAGC AAATTTTGTT GATTCTGATA AAAAGCCATA TCCTGGGTGA ATTGCATTAG 1500

CACCAGTGAT TTGTGCAGCA GATATGATGC GGTCAATATT TAAATAACTA TCTAAAgCAT 1560

TArcwTCCCC AATACATATA GCTTGATCTG CTAAATGTAC ATGCAAGCTT TGCTCGTCCC 1620

CTTTTGCATA AACTGCTACA GTTTCAATCC CATATTCTCT GCAAGCTCTT ATAATCCTTA 1680

CAGCAATTTC ACCTCTGTTC GCAATTAAAC AACGAAGCAT TTACTTACCC CCTTTACTTA 1740

ATACGTACCA AAACTTGGTC GTATTCAACA TTTGTGCCAT GATCAGCTAC TATTTCAGTA 1800

ATrfCTCCAG CAACATCTGT TGTTACCTCG TTTAATACTT TCATCGCTTC AACATATCCT 1860

ATAATATCTC CCTTGTTAAC TTTGTCACCG ACATTCACAA TTGGTTCAGT TAATTCTTTA 1920

CTATCTTGTA AAAAGAATGT ACCTATCATT GGTGATTTAA TGTCATGATA ATCATTTGTC 1980

GAAACATCGG AGTTATCATT CGCTTTTGAA GCTGTCAAAT CATTATTGTT CATACTTTGA 2040

TTTGATTGAT TACTGTGTGC AGCCAAATGA TTCGAGTCAG TGAAGTCAAT TTCTATTTCA 2100

TCTTCAAAAT TTTTATATTT AAATTTCTTA ACATCATTTT CCTTCACTAA TTTGATTATT 2160

TGTTCGATTT nTTCAATATT CATTTTACAA ATCCCCTTTT AAAATTGTTG CTAATTTTTT 2220

CGAAGTATGT CGCAAGCTAG ATGTATCAAA AATTGGAGTC TTTTGATGAC TCTTAAGAAT 2280

TTCATTAAAC AGAGACATTT GTTCCCGATT CTTATCTACA GCTTCTTGGA ATGATATCCA 2340
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TACAGTTGCA ATTTTGGTAT AACCACCTAT CGTTTGTTTA 


TCATTAAGCA 


GAATAATAGG 


2460 




TTGACCAtCA TTTGGTACCT GAACACTACC AAGAGCAACC GGTTCAGAAA TGATATCTGC 2520

TTGATTAAtT GGTGCAACGC TGTCACCTTC CAAACGATAG CCCATACGGT CTGATTGTTC 2580

AGTAATTAAA TATGGATGAT TTACAATTTT CGCTCTAGCC TCTTCAGAAA ATGCCTCGAA 2640

TTGAGGTCCT TGAAGAATGT GTATAATATT ATTTTCTGGC AATAAATCGT CCTGTAAATG 2700

AATCGTCTTT CCAATGTTTT CTTTAAAGTC ATTATTTATT TTCACTGTTA TTACATCATC 2760

AGCTAATAAC TTTCTACCTT TGAAT 2785


(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1010 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:


AATGGAAACG GTTGAAACAG CAATTATTAC TATTTCTATG GGTGAAGGTA TTTCAGAGAT 60

ATTTAAATCA ATGGGTGCCA CACATATCAT TAGTGGTGGA CAAACGATGA ATCCTTCTAC 120

AGAAGATATC GTTAAAGTCA TTGAACAATC AAAATGTAAA CGTGCAATTA TTTTACCGAA 180

TAATAAAAAT ATCTTAATGG CAAGTGAACA AGCAGCGAGT ATTGTTGATG CAGAAGCTGT 240

TGTTATTCCA ACGAAATCTA TTCCTCAAGG TATAAGCGCA CTATTCCAAT ATGATGTGGA 300

CGCAACACTT GAAGaAAATA AAGCGCAAAT GGCTGATTCA GTAAATAACG TTAAATCTGG 360

TTCATTAACG TACGCTGTTC GTGATACGAA AATTGATGGC GTTGAGATTA AAAAAGACGC 420

GTTTATGGGC TTGATTGAAG ATAAGATTGT AAGCAGCCAA AGTGATCAAT TAACAACGGT 480

TACTGAGTTG TTAAATGAGA TGTTAGCAGA AGATAGTGAA ATATTGACTG TGATTATTGG 540

TCAAGATGCA GAGCAAGCAG TTACAGATAA CATGATAAAC TGGATCGAAG AGCAATATCC 600

AGATGTAGAA GTGGAAGTTC ATGAAGGTGG ACAACCAATT TATCAATATT TCTTTTCAGT 660

AGAATAAAAA TTTAAAATAA AAAACTACCA ATGATAAATC ATCAGTTGGT AGTTTTTTAT 720

TTTGCTATTT TAGTGATATT GCGGGTTAAA AGTATCGTTC TCGAGTTGCT AACAATGTCA 780

TGTTCAACTT AGTCATGATA AAATAAATAA CATACTAAAT GATACGTAAA ATCAAATAAA 840

ACATAGGTGA TTTATTTTGG CTAAAGTAAA CTTAATAGAA AGTCCATATT CTCTTTTACA 900

ATTAAAAGGT ATAGGTCCTA AGAAAATAGA AGTATTGCAA CAACTAAATA TTCATACAGT 960



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(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1540 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 





TGTAGTTGAA CATGAACAAC AAAAGAAAGA AAAGACAAAA AAGCAATACA AGCCATTTTG 60

GATTGTCATG AGTTTTATAA TACTTATAGT TGTACTATTA CTCCCGGCAC CTTCAAGTCT 120

GCCGATAATG GCTAAGGCAG TACTAGCTAT TTwAGCTTTT GCAGTTATTA TGTGGGTAAC 180

GGAAGCTGTA TCATATCCGG TGTCAGCAAC TTTAATTATT GGCTTAATGA TATTACTTTT 240

AGGATTTAGC CCTGTTCAAA ATTTAGGGGA GAAGCTAGGT AATCCGAAAA GTGGCAGTGC 300

TATTTTAGCT GGAAGTGACC TTCTAGGAAC TAATCATGCA TTATCATTAG CGTTTAGTGG 360

ATTTGCAACT TCAGCTGTAG CTCTCGTTGC AGCTGCATTA TTTTTGGCTG CTGCTATGCA 420

AGAAACGAAT TTGCATAAAA GACTAGCTCT TTTAGTGTTA TCAATTGTTG GTAATAAAAC 480

TAGAAATATA GTTATTGGAG CAATTATCGT TTCAATTGTA CTTGCATTTT TCGTTCCTTC 540

TGCAACAGCT AGAGCAGGGG CAGTTGTACC AATCTTGCTG GGTATGATTG CGGCATTTAA 600

AGTTTCCAAA GATAGCAAGT TAGCGTCTTT ATTAATAATT ACTTCAGTAC AAGCTGTGTC 660




AATTTGGAAT ATTGGTATCA AAACGGCGGC AGCACAAAAT ATCGTAGCGA TTAATTTTAT 720

AAACCATCAA TTAGGATTTG ATGTTTCATG GGGCGAGTGG TTCTTATATG CAGCGCCTTG 780

GTCCATAGTT ATGTCCGTAG CTTTATATTT CATCATGATT AAAGTGATGC CTCCAGAAAT 840

TAATfcCAATA GAAGGTGGTA AAGATTTAAT AAAAGAAGAA TTGCATAAAC TTGGCCCCGT 900

TAGCCCACGT GAATGGCGTT TAATTGTTAT ATCGATGTTA TTATTACTGT TTTGGTCAAC 960

TGAAAAAGTA TTACATCCGA TTGACTCTGC ATCCATTACT ATTATTGCTT TAGGTGTTAT 1020

GTTAATGCCG AAAATTGGTG TCATGACATG GAAACATGTT GAAAATAAAA TACCATGGGG 1080

AACAATTATC GTGTTTGGTG TAGGTATTTC ACTAGGTAAC GTTCTTTTGA AAACAGGTGC 1140

AGCTCAATGG TTAAGTGATC AAACTTTTGG TGTTTTAGGT TTAAAACATT TACCTATTAT 1200

CGCGACAATT GCACTTATCA CGCTTTTTAA TATATTGATT CATTTGGGCT TTGCGAGTGC 1260

AACAAGTTTA TCATCAGCGT TAATACCTGT TTTTATTTCG CTAACCTCTA CGTTACACTT 1320

AGGAGACCAG TCTATAGGAT TTGTTTTAAT TCAACAATTT GTTATTAGTT TTGGTTTCTT 1380



55 
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AGATTTCTTG AAGGCAGGTA TACCATTGAC AATTGTAGGG aATAtCtAgT GaTAGTTTTT X500 
AGCATGACTT ATTGGAAATG GGTAAGGTTG CnTTAATTAA 1540 
(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11823 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

ACTTCTCACA ATAAGAAATA TGAAATTGTT ATGTGTTAGT TGAGATTCAG TGATGAATTA 60

CTTTTATCAT TTAAAATGTT GTTATCATTG TCATGCGTTA CCAAATCGCT TACGTATACA 120

CGATTCCCAA TCTTAACATA GACGATTTGT ATATCAGAAT TTTCTGATTA CTAACAGTTT 180

ACCTAAGTTT AAATATCTGT TCAATGATTT TCAGTTATTT TTAAAAGAAA AATCGTAATG 240

CTGCCATGAT AACAATCCCA CTAATAATTG TAATAGTTAA AtACGCGTGA TTATAGATAA 300

AATAACCGTC GGAATGAGCG CGATAATGTA AGGGATGTTT AATGTATACC CCTCACCATG 360

AGGCGTCTGT TGAATAATGC TGTCAATGAC AAGTGCCGTA AATAGTGTGA TTGGGATAAA 420

TGATAGCCAT CGAACCACGA CATCAGGCAA TTGCACTTTT GAAATCATGA TAAAAGGTAT 480

AATTCGAATT AATAGCGTTA CGATACCACA CAATAAAATA AGTATTAACA TGTTCATATG 540

AGTTATCATT GTTCCATCAT CACTCCTAAC GCTGCTGAAA TTGTGGCTGC AATTAATATT 600

GCTAGATATG AAGGCATAAA CATACTTAGC GATAACATCA TTACTATGAC GGCAATAATG 660

AGTACTATGT AAATTCTTAA TCGCGATTTA GTAATTGATT CAAATTGCGC AATGGCCAAA 720

AAGATAAACA TAGCCGTGAT AGCAAAATCT AACCCTAGCG TTTGCGGATT TGAGATATAT 780

TCGCCAAATA AAGCCCCAGC TACACATGAA ATTGCCCAAA ATAAATATGC TGTGATGTTA 840

AGACCATGCA TCCAACGATC ATTGATAGCT TCTCCTTTTA AATAAGGTGT AATGGCGACG 900

CCAAACGTTT CGTCAGTTAC TAATGAACCT AATCCAACAC GGTTCCAAAA CCCATATGTC 960

TTGAAGTTTG GTGCAAGCGA CATACTTAAA AGGAACATTC TTGAATTTAC GATAAATACA 1020

GTTAGTACAA TCGCTGATAT AGGTGTACCT GCTATAAACA ACGCGCACAT AATAAATTGC 1080

GCAgcaCCGG CATATATAAC AAGACATAAC AAGACAATTT CTAAAATACT AAAGTTTTGA 1140

GACGAAGCCA CAATACCAAA TGAAATACCA ACACCGGCAT AACCCAATAA TGTTGGGATA 1200

CACTCTTGCA CGCCTTGTCT AAAACTTAAA TGTGTTGTCA TCTCAATTAC CTCCTTTGCC 1260

TAAGCAATAA CATTAGACAT CAGTTTGTCT GAGGTTAGAC ATTCCGGAGT CTTTAGTCAG 1380 

CTTCATATTA ACTTTTTATT TTTGAGAATT TTCAATTTTT TATTTAAGAC TACCTCCATA 1440

TTTTCTATGG aTTTGTAGTT GTTTTTAAGT ATCAATTTTA TAAATTTTTA TATCTGATGA 1500

TGAGTCTGGG aTATTGaTTC ATGTACCACT CCCTTaTaAT CATCCCCTCC CCCTaCCCTA 1560

CTCCATCGAT ATAACTCATA CTACATATCA ACGAAATCAG TATTTTATCG CTTCCTTTCC 1620

TATATTAGTG ATGCTCAAAC TTGTTACGTT TTAGATTGTT TTAGTTCATC ATAATTATCC 1680

CGTATTGTTG CTATAATGAA ATGCGTTCAC CCCATTAAAC CACAAACTTA ATTTATTGTT 1740

GTTATGTGCA TTGGCTCACT ATTATATTTT TACAGCACAA AAAAAGTGGC GACAGTTCGT 1800

CACCACTTTT TAAAATATTA TTTAAAGTAT CTTGCCCTTG CTTTAAGTAT ACGTAGATAT 1860

ATACTTTTTA AAGCTTGTAG CTAAAGCCTT TATTTAACTG GTTTTGAAAT TTGTGTTTTA 1920

CCACCCATAA ATGGTACTAA TGCTTCTGGA ATTGTTACTG TTCCATCTTC ATTTTGGTAA 1980

TTTTCAACAA TAGCAGCAAA TGTACGTCCA ACTGCTAAAC CACTACCATT TAATGTATGT 2040

GCTAATTCTG GTTTAGCTGC TTTGTCACGC TTGAAGCGGA TGTTAGCACG ACGCGCTTGG 2100

AAATCCGTAC AGTTTGAGCA TGAACTAATT TCTTTATAAT CATTGTAGCT TGGTAACCAA 2160

ACTTCTAAAT CATATGTTTT GCTTGCACTA AATCCAATAT CACCTGTACA TAAAATAACA 2220

CGACGGTATG GTAAAGCTAA CTCTTCTAGA ATTGCTTCTG CGTTTGTTGT CATTTCTTCT 2280

AAAGCATTCC ATGAATCTTC AGGTTGTTCA AAACGTACCA TTTCCACTTT ATCGAATTGA 2340

TGTAAACGAA TTAATCCTCT TGTATCTCTA CCTGCTGATC CTGCTTCACT ACGGAAACAT 2400

GCAGATTGAC CAGTGAATTT TTCAGGAAGT ACACCTGGTT GAATAATTTC ATTACGGTAG 2460

AAATTCGTTA ATGGTACTTC AGCAGTTGGA ATTGTATATA ATCCTTCTTT TTCTACTTTA 2520

AATA^ATCTT CTTCAAATTT AGGTAATTGA CCTGTACCAT ACATTGTATC TGCGTTCACA 2580

AGCTGTGGTA CCATCATTTC TGTATAACCA TGTTGTGTTG TATGTTTTGT AATCATATAG 2640

TTCATTAAAG CACGCTCTAA TTGCGCACCT TCATTTGTTA AATATACAAA ACGCGCACCT 2700

GAAACTTTTG CTGCACGATC AAAATCAGCC ATTTTCAATT CTTCTACAAT ATCCCAATGT 2760

GCTTTGGGTT CAAATGAAAA CTCaCGTGGT GTACCCCACT TTTTAACTTC AACGTTATCT 2820

TCATCAGATT CACCTTGAGG TACATCATCA CTTATTAAAT TTGGAATACG ACAAAGGATA 2880

CCTGTCATTT TATTATCAAT TTCATTTAAT TGACTATCTT TTTCTTTAAT ATCGTCACCT 2940

AATGTGCGCA TTTCAGCAAT CACATCATCA GCATTTTCTT TATTACGTTT TTTTAATGCG 3000

ATTTCTTCGC TTACTTTATT ACGACGTGCT TTCATTTCTT CTGTTGCACT AATTAATTTA 3060 
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TCAATTTTGC TCTTAACTGT GTCAGGCTCA 
TTTCTGAATA ATCTAATGTC TAACATTAAC 3180

CTTCATCCTT TCCCAAATAA TTATCATTTA TTATGGAATG ACGTACGTCT TTATTTTTTA 3240

GAAAATAAAA AAAGACCACA TCCCTACAAG GGACGTGGTC TACGCGTTGC CACCCTATTT 3300

AACAATTTAA GTTATAAAGA TACACTAAAC CTAAATTGCA CTTCACTAAA ATAACGGTTA 3360

TCACCGATTG TTCTTTTAAA TTAAGTAGGT AGATTCATAT ATATGTTGAT TCTTGTTCAC 3420

ACTAACCACA AGCTCTCTGA TATCGAACAC TATATATTAC TTGTCCTACG AACAATGTCT 3480

TATTAAGTTA TTTTTAATAT AGCAAACTAT ATTTGCTTTT TCAAGTAACG ATTTCAAACA 3540

TCACTCATGT CGATTTAGTG ACATGCAGTC GTTTGATAAA TTGATTGCTT TAAATACTGT 3600

GCAACCGCTT CAATATCTTT ATGAAATTGA CGATCATGTG TAATGGATGG CACGATACTT 3660

CGAAACTCAT CATACTTGCG ACGTGTTTTT GGTGATAATC CTTCAACACC TTTTAACTCT 3720

GCTGCTTGTA ATGCAATAAC ACATTCGATT GCCAGCACAC GTCTTGCATT TTCAATAATT 3780

TGATAACCAT GTCTAGCAGC TGTAGTTCCC ATAGATACGT GATCTTCTTG GTTCGCAGAT 3840

GAAGTGATAG AATCAACACT CGCTGGATGC GCTAAAGTTT TATTTTCAGA AACGAGACTT 3900

GCAGCAGCAT ATTGCATAAT CATCGCGCCA CTTTGCAATC CTGGCTCTGG ACTAAGAAAT 3960

GCTGGTAAAT CACCATTTAA TTGAGGATTT ACTAGTCGCT CTAGACGACG TTCCGATACG 4020

TTTGCTAATT CACTTACACC TAATTTAAGA TGATCTAATG CAAAAGCAAT AGGTTGTCCA 4080

TGGAAGTTAC CACCTGAAAT AACAAACGTT TCATTTGCTT CCTCAAATAT AAGTGGATTA 4140

TCATTAGCCG CATTCATTTC AAATTCTAAT TGCTGTTTAA CATAATTGAA TACTTGAAAA 4200

CTCGCGCCAT GGATTTGTGG TATACAACGC AACGTATATG CATCTTGTAC ACGTATTTCT 4260

GATTGTCGCG TCGTTAATGT TGATCCTTCT AACCAATCAC GCATACGCGC TGCCACATTA 4320

ATCTGTTCTT GAAAATTACG AACTGCGTGC ACATCATGTC GATATGCATC TATAATGCCA 4380

TTAAGAGACT GATGCGTTAA TGCAGCAATC CATTCAGATT GGTAACCTAA ATCTTCTGCT 4440

TCTATATAAC TAATGACACC TTGAGCTGTC ATAGCTTGCG TACCATTAAT CAATGCTAAA 4500

CCTTCTTTAG CCTGAAGGTT CAAAGGTTGT CTATTTAATT CTCTTAATAC ATCGTCACTA 4560

TCCTTTTCTT CCCCTCTGTA CAATACTTTC CCTTCACCAA TTAATGCTAA TGCTAAATGT 4620

GATAATGGCG CTAAATCTCC TGATGCACCG AGAGAGCCTT GCTGTGGGAT TATCGGTATA 4680

ATACGTTCAT TTATAAAAAA TTGTAATTGT CTCACTAATT CTAAAGTGGC ACCTGAATGA 4740

CCTTTTAATA ATGTATTCAA TCGTAAAATC ATCATGACTA ATGCTACTTC TTTTGAAAAT 4800

GGCTCACCTA GTCCACAGGC ATGTGAGCGT ATCAGATTCA CTTGTAATTC ATTATATTGC 4860
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TCCTCATTTT CAATAATACG TTCAACTACC GCTCTACTTT TTTTGACACG TTCTAACGCA 4 980 

TCATCAATAA TTTCAATCTT TGATTGTTGT TGTAAAAATG ATTTAATATC CTCAATTGTT 5040 

AGTGTTTCAC CATCTAAATA TAAAGTCATA TATGTTACCC CCTTGTTTAT ATTAAGTAAC 5100

CCATCCTTCT TGAAGTATAC GTTTTCATTT TTATTGAAAC AATGGTTTTA CGTACATTTA 5160

TAACCTATTA TCAGAGCACT ATTGTAGTGC GTTAAAGGAT ATTAAGATTG TTGTAAGCAT 5220

ATTTAATAAT TTATCTATTG ACGAATTGCA TATACAGGTA TAGTATTTTC TATTGTATTT 5280

AACGACAAAT AATAATGAAT TCAGAAATTT ATAATACATT TTGTTAAAAG TTACTATATA 5340

TTTTTAAAAT TGAATAAATT CGGAAAAGGC TTTTACATGG GAGGTTATAT CACTATGGAA 5400

ACGTTAAATT CTATTAACAT TCCTAAGCGT AAAGAAGATT CACATAAAGG TGATTATGGC 5460

AAAATTTTAT TAATTGGTGG ATCTGCTAAC TTAGGTGGTG CCATTATGTT AGCGGCTCGT 5520

GCATGTGTAT TTAGCGGTAG TGGTTTAATC ACTGTAGCTA CACATCCAAC AAATCATTCA 5580

GCATTACATT CTCGTTGCCC AGAAGCGATG GTTATTGATA TTAATGATAC GAAAATGTTG 5640

ACGAAAATGA TTGAAATGAC TGACAGTATA CTAATTGGTC CAGGTCTTGG CGTTGATTTC 5700

AAAGGAAATA ATGCCATTAC ATTCCTACTA CAAAATATAC AACCGCATCA AAATTTAATC 5760

GTAGACGGCG ATGCGATTAC AATCTTTAGT AAACTGAAAC CGCAATTACC TACATGTCGT 5820

GTGATCTTTA CACCACACCT CAAAGAATGG GAACGATTAA GTGGTATTCC TATTGAGGAA 5880

CAGACATATG AGCGTAATCG TGAAGCAGTT GATCGTTTAG GTGCAACTGT TGTACTTAAA 5940

AAACATGGTA CTGAAATTTT CTTTAAAGAT GAAGACTTTA AATTGACAAT CGGTAGCCCA 6000

GCAATGGCGA CTGGTGGTAT GGGCGATACA CTTGCTGGTA TGATTACAAG CTTTGTCGGT 6060

CAATTTGATA ACTTAAAAGA AGCGGTTATG AGTGCCACAT ATACACATAG TTTTATTGGC 6120

GAAASCCTTG CAAAAGATAT GTATGTGGTG CCACCATCAA GACTTATCAA TGAAATACCT 6180

TACGCAATGA AACAATTAGA AAGTTAGTCA TTACTAATCA TTGAATATAG TAAAGCATTA 6240

CTTTCTAGCA TAAAAATAAG ACTCCCCTAC ATATAGGGAA GTCTTATTTT TTATTATTCT 6300

TCATCTGATG ATTGTTGTAT ATCTTCTTCA ACACGATCCA TGAAATCTTG TCTTACTTCA 6360

ATACGTCCAT CTTCATCATT TTCTTCTGAA TCAATCACTT CAGTATGAAT TGCATTTCCT 6420

GGTGTTTCAT CATTTaCAAC CGCTTCACGT TGTTGTTCAG TACCATCTTC AGATACAGTT 6480

GAAGTAGATT GCTCATCTTC ATTCGTTTCA TCTTCTGCAT CTTCTTTTAC TTTAGCAACC 6540

GTTGAAACAA ATTGATCATC ACCTAAGCGA ATTAAGCGAA CACCTTGTGC TGCACGACCA 6600

TTTTGAGAAA TATCTGCAAC ATCTAGTCGA ATAATGACAC CTGCATTAGT AACAATCATT 6660 
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10 






GTAGCTGTTT TAATACCTTT ACCACCACGA TTTGATAAGC GATAGTCATT AACTGGCGTA 6780 

CGTTTACCAT AACCATTTTC AGTAACTACT AATACTTCAT CAACACTGTT TGCATGAGCT 6840 

ACATCAAGCC CTACAACTTC GTCACCTTCA CGAAGTGTAA TACCTTTCAC ACCCGTTGCT 6900 

GTACGGCCTA AAGGACGTAA TGTTGATTCA GGGAATCGAA TTAATGATGC ATGTGATGTA 6960 

CCAATCAAGA TATCTTCTTG ACCACTTGTT AAGCGAACTG CAATTAACTC ATCATCTTCT 7020 

CTGAACGAAA TCGCAATCTT ACCATTTCTA TTTATTCTTG AGAAGTTACT TAATGCTGAA 7080 

CGTTTAACGA CACCACGTTT AGTTGCAAAC ACTAAGAAGT TGTCTTCACT TTCAAGGTCT 7140 

TTAACAGCAA TCATTGTACT AATGACTTCA TCATTTTCAA GTTCAATAGC ATTCACTACA 7200 

GGAATACCTT TAGACTGTCT TGATAACTCA GGCACTTCGT AACCTTTAAG TTTGTATACA 7260 

CGACCTTTGT TAGTAAAGAA CAATACATGG TCATGTGTAC TTAAAGTTAC CAATTGACTG 7320 

ACAAAATCTT CTTCCAATGT ATTCATACCT TGAACACCAC GACCACCACG GTTTTGAGCA 7380 

CGATATGTAG ATACCGGCAA ACGTTTAATG TAGTTATTAT GGCTTAGTGT AATTACTATT 7440 

TGTTCTTCTG GAATTAAGTC TTCGTCCTCT AAGTCTTCAA ATCCACCTAA TTGAATTTCT 7500 

GTACGACGAT CATCACCGAA ACGATCTCTA ATTTCAGTCA ATTCATCTCT AACTAACTGT 7560

AATAACACTT CTTCATCAGC TAAGATTGCT TCTAATTCAC TAATATAATT TAATAACTCA 7620

TTATATTCAG CTTCAATTTT GTCTCTCTCT AAACCTGTTA GACGTCTTAA ACGCATGTCT 7680

AAAATAGCTT GAGCTTGTTT TTCAGAAAGT TTGAAGCGTT GTTGCAAGCT TTCCATTGCA 7740

ACTTTATCTG TATCTGACTC ACGAATCGTT GAAATAATTT CATCGATATG GTCAAGTGCG 7800

ATACGTAATC CTTCTAAAAT GTGGGCACGA TCTTTAGCTT TACGTAAgTT GTATTGCGTA 7860

CGTCTTCTAA CAACTGTCTT TTGATGCTCT AAATAATGTA CCAACGCTTC TTTTAAATTA 7920 

ATAAGCTTCG GTCTACCATT TACAAGTGCA ATCATATTCA CACCAAATGA TGTTTGAAGA 7980 

GGTGTTTGTT TGTATAAGTT ATTTAAAATG ACACTAGCAT TTGCATCCTT ACGCACATCA 8040 

ATAACGACAC GCACACCAGT ACGTAAACTT GTTTC^TCAC GTAAATCAGT GATACCGTCA 8100 

ATTTTCTTGT CACGAACGAG CTCTGCAATT TTTTCAATCA TACGAGCCTT ATTCACTTGG 8160

AAAGGAATTT CAGTGACAAC AATACGTTGA CGTCCGCCTC CACGTTCTTC AATAACTGCA 8220 

CGAGAACGCA TTTGAATTGA ACCACGACCT GTTTCATATG CACGTCTAAT ACCACTCTTA 8280 

CCTAAAATAA GTCCAGCAGT TGGGAAATCA GGACCTTCAA TATCCTCCAT TAACTCAGCA 8340 

ATTGAAATAT CAGGGTTCTT ACTTAAGCTA AGTACACCAT TGATTAATTC TGTTAAGTTA 8400 

TGTGGTGGAA TATTCGTTGC CATACCTACC GCGATACCTG ATGCACCATT GGCTAATAAG 8460 
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AAATCTATTG TATCTTTATT AATATCACGT 
AACAGTTCAA GTGTGATTTT AGTCATACGC 8580

GCTTCAGTAT AACGCATTGC TGCTGCGCCA TCTCCATCCA TTGAACCAAA GTTACCTTGG 8640

CCATCAACAA GCGGATAACG ATAACTGAAA TCTTGAGCCA TACGTACCAT TGCTTCATAA 8700

ATAGATGAGT CACCATGAGG GTGATATTTA CCCATTACGT CACCAACGAT ACGTGCTGAT 8760

TTTTTATATG ATTTATCCGG TGTCATACCT TGTTCATTTA ATCCATATAG TATACGACGA 8820

TGTACTGGTT TTAAACCGTC ACGAACATCT GGCAATGCAC GAGCAACGAT AACACTCATC 8880

GCATAATCTA AAAATGATTC ACGCATTTCA CTGGTAATAT TTCGTTCATT TATTCTTGAT 8940

TGAGGTAATT CAGCCATCAA GAGTTCCTCC TTCAAAAGTT CAGTTCACAG CGCTTAGAAG 9000

TCTAAGTTTG CATAAACTGC ATTATCTTCT ATAAATTGTC TACGGTTTTC TACAACGTCA 9060

CCCATTAACA TTTCAAATGT TTGGTCCGCT TCAATCGCAT CTTCAAGTTT TACTTGTAAA 9120

AGAGCGCGGT GCTCAGGGTT CATTGTTGTT TCCCAtAATT GATCTGCATT CATTTCTCCA 9180

AGACCTTTGT ATCGTGCAAT AGACCATTTT GGTGTTGGAT TCAATTCAGA TTTAAGTTTA 9240

TCAAGTTCCC TATCATTGTA TACATAATAC TTTTGTTTAC CTTGTGTCAG TTTATACAAC 9300

GGTGGCTGTG CAATATACAC ATAGCCTGCT TCAATTAACG GTCTCATAAA TCGATAGAAG 9360

AATGTTAATA ACAATGTTCT AATATGCGCT CCATCCACAT CGGCATCAGT CATAATGACG 9420

ATTTTGTGAT ATCTTGCTTT CGCTAGATCA AAGTCGCCAC CGATTCCTGT ACCAAATGCT 9480

GTGATCATTT GACGAATTTC ATTGTTATTC AAAATTCTAT CTAATCGTGC TTTTTCAACA 9540

TTTAATATCT TACCTCGTAA TGGTAAAATC GCCTGCGTTC TAGAGTCACG ACCAGATTTT 9600

GTAGACCCCC CGGCAGAGTC CCCTTCGACT AAGAAAATCT CACATTCTTC AGGACTTTTA 9660

CTAGAGCAAT CGGCTAATTT ACCTGGAAGG CTTGCTACAT CTAACGCTGA TTTACGACGT 9720

GTTACTTCAC GCGCTTTTTT CGCAGCAACA CGTGCACGTG CCGCCATAAT ACCTTTTTCA 9780

ACCACTGTAC GTGCGACTTG TGGATTTTCA TATAAAAATC GTTCAAAGTG CTCTGAGAAT 9840

AATTTATCTA CAACTTGACG CACTTCAGAA TTACCTAATT TTGTCTTCGT TTGACCTTCG 9900

AATTGAGGAT CACCATGTTT GATAGATATA ATTGCTGTCA TACCTTCACG TGTATCTTCA 9960

CCAGAAAGTC TATCTTTTTC TTCTTTCATA ATCTTGCTAC TTAAACCATA ACTATTTAAG 10020

ACACGCGTTA ATGCACGTTT GAATCCGTCT TCATGCGTAC CACCTTCATA CGTATGAATG 10080

TTATTTGCGT AAGTTAAAAG ATTTGTGGCA TATCCTGAGT TATATTGAAT CGCAATTTCT 10140

ACTTCAATAT CATCTTTAGA TTGATGAATA TAAATTGGCT CATCATGAAT AGGTTCTTTA 10200

TTTTCGTTCA ATAACTCAAC GTACGATTTA ATACCGCCCT CATAGTGATA GGAGTCTTCT 10260
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GCAAGCTCTC TAATACGCTG CTGTAATGTT TCATAGTTGT ATACAGTTGT CTCTGTGAAG 10380 

ATTTCTCCAT CTGCTTTAAA ACGAAtGaCA GTACCTGTCT TAtCAGTnGT GCCAACTTCT 10440 

TTTAAGTCAA ATTGAGGTAC ACCTTTTTTA TATGCTTGAT GATATATAGT CTCATTTCTG 10500 

TGTACATATA CTTCTAAGTC TTGTGACAAT GCGTTTACAA CTGATGAACC AACACCATGT 10560 

AAACCACCAG ATACTTTGTA TCCGCCACCG CCAAATTTAC CACCAGCATG TAAAACAGTT 10620 
AAAATAACTT CGACAGCTGG ACGTCCCATT TTTTCTTGAA TATCAACTGG GATACCACGT 10680

CCGTTATCCG TTACTTTAAT CCAGTTATCT TTTTCAATAA CAACTTCAAT TTGATTTGCA 10740

TAACCaGCTA ATGCTTCATC GATACTATTA TCGACAATTT CCCACACTAA ATGGTGCAAA 10800 

CCTCTCTCTG AAGTCGATCC TATATACATA CCTGGTCTTT TACGTACTGC TTCTAAACCT 10860 

TCTAATACTT GTATTTGCCC AGCACCATAA TTATCCGTGT TGTTTACATC TGACAATGCA 10920 

GTCACCATCG CTTTCTGTTA CTTTATAATT TCACCTTGAT TAATACGATA CAATTTAGCG 10980 

TTATTCATGA TTTCATGATC AATACCATCT ACAGATGTCG TAGTGACAAA TGTTTGTACT 11040 

TTATGCTGAA TCGTACTTAA TAAATGCGTT TGACGCGAAT CATCTAATTC ACTGAGTACA 11100 

TCGTCTAATA ATAAGATGGG ATATTCCCCA ACTTCGATAT TCATTAACTC AATTTCAGCT 11160 

AATTTAATGG ACAAAGCCGT TGTACGTTGC TGTCCTTGAG AACCATATGT TTGAGCATCC 11220 

ATGCCATTCA CATCAAAACT TATATCATCT CGATGTGGTC CGAATAAGCT AATGCCTCGT 11280 

TCTTTTTCTC TTTGCATATT ATCGCTAAGA ATAGACATAA TTTCTTCAAG TCGTGCCGCT 11340 

TCATTTTGAG CATAATCAAA TTTAAGACTA GGTAAATAAT TCAGCGACAA CGCTTCTTTA 11400

TCATTTGTGA TACCAGCATG AATCGGTTTA GCTAACGACT CTAGCTCTTG AATAAAATGT 11460

GCACGTTTAT CAGTTACTTT CATTGCATAT TCAGCAAACT GCTGATTTAA TACTTCCAAC 11520 

ATTGTTAAGT CCTTTTTTTG GCCTAATTGT AACTGCTTTA AGTAATTATT CTTTTGCTTT 11580 

AAAATACGTT GGTATTGAGC TAAATCATTT AAGTAAACAG CAGAAATTTG GCCCAACTCC 11640 

ATATCTATAA AGCGTCGTCT TATTtGrGGr GAGCCTTTTA CAATATTCAA ATCTTCTGGC 11700 

GCAAATAGAA CCACATTGAG GTGTCCAATA TATTGAGTTA GACGACTTTG CTCTAAGTGn 11760 

ATTCACTTTG GACTTGTTTA CCTTTnTTAG TTATAAACAT TGTTAATGGG CATCGTGCCG 11820 

TCT 11823 
(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 692 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

ATAATTATTA ACATGGTGTG TTTAGAAGTT ATCCACGGCT GTTATTTTTG TGTATAACTT 60 

AAAAATTTAA GAAAGATGGA GTAAATTTAT GTCGGAAAAA GAAATTTGGG AAAAAGTGCT 120 

TGAAATTGCT CAAGAAAAAT TATCAGCTGT AAGTTACTCA ACTTTCCTAA AAGATACTGA 180 

GCTTTACACG ATTAAAGATG GTGAAGCTAT CGTATTATCG AGTATTCCTT TTAATGCAAA 240 

TTGGTTAAAT CAACAATATG CTGAAATTAT CCAAGCAATC TTATTTGATG TTGTAGGCTA 300 

TGAAGTTAAA CCTCACTTTA TTACTACTGA AGAATTAGCA AATTATAGTA ATAATGAAAC 360 

TGCTACTCCA AAAGAAACAA CAAAACCTTC TACTGAAACA ACTGAGGATA ATCATGTGCT 420 

TGGTAGAGAG CAATTCAATG CCCATAACAC ATTTGACACT TTTGTAATCG GACCCGGTAA 480 
CCGCTTTCCA CATGCAGCGA GTTTAGCTGT GGCCGAAGCA CCAGCCAAAG CGTACAATCC 540

mTTATTTATC TATGGAGGTG TTGGtTTAGG aAAAACCCAT TTAATGCATG CCATTGGTCA 600 

TCATGTTTTA GATAATAATC CAGATGCCAA AGTGATTTAC ACATCAAGTG AAAAATTCAC 660 

AAATGAATTT ATTAAATCAA TTCGTGATAA nA 692 
(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
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ATACTGTAGC GCAAATTTCA CAATGGCATG 


TTATAGAAGA TTTAGTTACG AATGAATTAG 60

GTATTAGTAT TTTACCAACA TCAATTTCAG AgCAACTAAA TGGAGATGTG AAGCTGtACG 120

CATTGAAGAT GCTCATGTAC ATTGGGAATT AGGTGTTGTT TGGAAGAAGG ATAAACAATT 180

AAGTCATGCC ACAACGAAAT GGATAGAATT TTTGAAAGAC CGTTTAGGCT AACATATTAA 240

TAAAGCACTC ATTATTTAAG GCGCATCATT ACGTGGGTCA TTGAAATAAT GAGTGTTTTT 300

TTGTGAAAAT GAAGTGAAAT TTAGAGAGCG TTTCCATAGA AAATAGTAAT ACAAACTATA 360

AAAAAAGAGT ATTTTTATAT TGTGTACGCC ATCTTTATAA TAGTTATTGT AACAATTTAG 420

ACATATTTAG AAAGGGATGG CGCCATGCAC AAAGTCCAAT TAATAATCAA ACTACTACTA 480

CAACTAGGAA TCATCATTGT GATTACTTAT ATTGGCACAG AAATTCAAAA GATTTTTCAT 540
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ATTGTACCGC 


TAACTTGGGT AGAAGACGGT GCAAACTTTT TATTAAAGAC GATGGTCTTT 660

TTCTTCATAC CGTCAGTTGT AGGtATTATG GaTGtgCTTC CGAAATTACG CTAAATTATA 720

TACTCTTTTT CGCAGTCATT ATCATAGGAA CATGTATCGT TGCATTATCT TCAGGTTATA 780

TTGCTGAAAA AATGTCyGtT AAACwTAAAC ATCGTAAAGG TGTAGACGCt TATGAATGAT 840

TACGTGCAAG CCTTATTAAT GATTTTGTTG ACTGTCGTTT TATATTATTT CGCTAAAAGG 900

TTACAACAAA AATATCCGAA CCCATTTTTG AATCCAGCAT TAATTGCATC TTTAGGAATT 960

ATTTTTGTCT TACTTATCTT TGGAATTAGT TATAACGGGT ATATGAAAGG TGGCAGTTGG 1020

ATCAACCATA TTTTAAACGC AACGGTCGTA TGTTTAGCGT ACCCACTTTA TAAAAATAGA 1080

GAGAAAATTA AAGACAATGT CTCTATCATT TTTGCAAGTG TATTAAcTGG CGTCATGCTG 1140

AATTTCATGT TAGTGTTCTT AACACTTAAA GCATTTGGCT ATTCTAAAGA CGTCATTGTA 1200

ACGTTATTGC CCCGATCTAT AACAGCCGCA GTAGGTATCG AAGTGTCACA TGAACTAGGT 1260

GGTACAGATA CGATGACCGT ACTTTTTATT ATCACAACGG GTTTAATCGG TAGTATTTTA 1320

GGTTCGATGT TATTAAGATT TGGAAGATTT GAATCTTCTA TCGCCAAAGG ATTAACGTAT 1380

GGGAATGCGT CACATGCATT TGGCACAGCT AAAGCACTAG AAATGGATAT TGAATCCGGT 1440

GCATTTAGTT CAATTGGGAT GATTTTAACT GCAGTTATTA GTTCAGTGTT AATACCTGTT 1500

CTAATTTTAT TATTCTATTA ATTTAGATAT TTAAAATGAT AGACAGAAAG GGAGGCTATT 1560

AGTAATAATG GCAAAAATAA AAGCAAATGA AGCATTAGTT AAAGCATTAC AAGCaTGGGA 1620

TATAGATCAC TTGTATGGTA TTCCAGGAGA CTCAATCGAC GCATAGTCGA TAgTTTACGT 1680

ACAGTGAGAG ATCAATTTAA ATTTTATCAT GTACGTCATG AAGAAGTAGC AAGCTTAGCG 1740

GCTGCTGGTT ACACAAAATT AACTGGTAAA ATCGGTGTGG CATTAAGTAT CGGTGGCCCT 1800

GGTTTAATTC ATTTATTAAA TGGTATGTAT GATGCCAAAA TGGATAATGT ACCGCAATTA 1860

ATATTATCTG GACAAACGAA TAGTACAGCA CTTGGAACGA AAGCATTCCA AGAAACAAAT 1920

TTACAAAAAT TATGTGAAGA TGTAGCCGTT TATAATCACC AAATTGAAAA AGGTGACAAT 1980

GTGTTTGAAA TCGTTAACGA AGCAATTCGT ACGGCATATG AACAAAAAGG TGTAGCTGTT 2040

GTTATTTGTC CTAACGACTT ATTAACTGAA AAAATTAAAG ATACAACGAA TAAACCAGTA 2100

GATACATCAA GACCAACAGT AGTATCACCA AAATATAAAG ACATCAAAAA AGCGGTTAAA 2160

CTAATTAATA AAAGTAAAAA GCCTGTCATG TTAATTGGTG TAGGTGCGAA ACATGCGAAA 2220

GATGAGCTAC GTGAATTTAT TGAAATGGCT AAAATTCCTG TCATTCATTC ATTACCAGCT 2280

AAAACAATCT TGCCGGATGA TCATCCATAT AGTATCGGtA ACTTAGGTAA AATCGGTACC 2340
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CCATATGTGG 


ATTACTTACC TAAGAAAAAT ATTAAAGCCA TTCAAATTGA CACAAATCCT 2460

AAAAATATCG GACATCGTTT CAATATTAAT GTAGGAATTG TTGGAGATAG TAAAATTGCG 2520

TTGCATCAGT TAACTGAAAA TATTAAACAT GTTGCTGAAA GACCATTCTT AAACAAAACG 2580

TTAGAACGTA AAGCGGTTTG GGATAAATGG ATGGAACAAG ATAAAAATAA TAATAGTAAA 2640

CCATTACGTC CAGAACGATT AATGGCATCA ATCAATAAAT TTATTAAAGA TGATGCAGTG 2700

ATTTCAGCAG ATGTAGGTAC AGCAACAGTT TGGTCAACTC GATACTTAAA CCTTGGTGTA 2760

AATAACAAGT TCATCATTTC AAGTTGGTTA GGTACAATGG GTTGCGGTCT TCCAGGTGCA 2820

ATTGCATCAA AAATTGCATA TCCAAATAGA CAAGCCATCG CAATTGCTGG TGACGGTGCA 2880

TTCCAAATGG TAATGCAAGA CTTCGCTACA GCAGTACAAT ATGATTTACC TTTAACTGTA 2940

TTTGTACTTA ATAACAAACA GTTAGCATTT ATTAAATATG AACAACAAGC AGCTGGTGAA 3000

TTAGAATATG CAGTTGATTT TTCTGATATG GATCATGCAA AATTTGCTGA GGCAGCAGGT 3060

GGTAAAGGTT ATACAATTAA GAGTGCTAGC GAAGTAGATG CTATAGTCGA AGAGGCATTA 3120

GCACAAGATG TACCAACGAT TGTAGATGTA TATGTTGATC CTAATGCTGC GCCATTACCA 3180

GGTAAAATTG TAAATGAAGA AGCGCTTGGT TATGGTAAGT GGGCATTTAG ATCAATTACT 3240

GAAGATAAAC ATTTAGATTT AGATCAAATT CCACCAATTT CAGTGGCAGC AAAACGTTTC 3300

TTATAACTGA TTTAAAGGTT ATCACAATTG AATTGAACTA TAAAAACGGT AATTTCTATT 3360

TCAACAAAAT GGGAATTGCC GTTTTGTTTA TTTATCACAA ATGATCGTAC TGAATTGATG 3420

ATAAAATTGT GAAAAAGTTG TTGAAAACGC TTTTACAAAT ATGTATAATA GCTATGAATT 3480

AGATATCACT TGCGTGTTAC TGGTAATGCA GGCATGAGCA AACAACCGCA CTATGAGAAT 3540

AGTCTTGTTT GTTCATGCCT GCTTTTTTTG TACATGGAAG CGGAAATTGA GATAGGGGAT 3600

GTTTfiTATGT TTAAGAAATT GTTTGGACAA TTGCAACGTA TCGGTAAAGC ATTAATGTTA 3660

CCTGTTGCGA TTTTACCAGC AGCTGGTATT TTATTAGCGT TTGGTAACGC AATGCACAAC 3720

GAACAATTAG TAGAAATTGC ACCATGGTTA AAAAACGATA TCATTGTAAT GATTTCGTCG 3780

GTCATGGAAG CAGCAGGACA AGTTGTATTT GATAACTTGC CATTATTATT TGCAGTTGGT 3840

ACAGCACTTG GATTAGCAGG AGGAGACGGT GTTGCAGCAT TAGCAGCGCT AGTAGGTTAC 3900

TTAATTATGA ATGCAACAAT GGGGAAAGTG TTGCACATTA CAATTGATGA CATTTTCTCA 3960

TATGCCAAAG GGGCAAAAGA ATTAAGTCAA GCAGCGAAAG AACCAGCACA TGCTTTAGTA 4020

TTAGGTATTC CAACGTTACA AACGGGTGTG TTTGGTGGTA TTATCATGGG TGCTTTAGCC 4080

GCATGGTGTT ACAACAAATT TTATAATATT ACACTACCAC CATTTTTAGG ATTCTTTGCA 4140
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AGCTTTGCGT GGCCACCAAT TCAAGATGGA TTAAATAGTT TATCGAATTT CTTATTAAAT 4260 

AAAAATTTAA CATTAACAAC GTTTATATTC GGTATTATTG AACGCTCATT AATTCCATTT 4320 

GGTTTACATC ATATTTTCTA TTCACCGTTC TGGTTTGAAT TCGGAAGTTA TACAAATCAC 4380

GCAGGTGAAT TGGTTCGTGG TGACCAACGT ATTTGGATGG CACAATTGAA AGATGGCGTA 4440

CCATTTACTG CTGGTGCATT TACTACTGGT AAATATCCAT TTATGATGTT TGGTTTACCA 4500

GCGGCGGCAT TTGCTATTTA TAAAAATGCA CGACCAGAAC GTAAAAAAGT CGTGGGTGGT 4560

TTAATGTTAT CAGCAGGATT AACTGCATTT TTAACTGGTA TCACTGAGCC ATTAGAATTT 4620

TCATTCTTAT TTGTAGGACC AGTACTTTAT GGAATTCACG TATTATTAGC TGGTACATCA 4680

TTCTTAGTAA TGCATTTATT AGGCGTTAAA ATTGGTATGA CATTCTCAGG TGGTTTCATA 4740

GATTATATTT TATATGGTTT ATTAAACTGG GATCGTTCAC ACGCATTATT AGTTATTCCA 4800

GTCGGTATTG TATATGCTAT CGTGTATTAC TTCTTATTCG ACTTTGCAAT TCGTAAGTTT 4860

AAATTGAAAA CACCAGGTCG TGAAGATGAA GAAACTGAAA TTCGTAACTC TAGTGTCGCA 4920

AAATTACCAT TTGATGTCTT AGATGCAATG GGTGGAAAAG AAAACATTAA ACATTTAGAT 4980

GCATGTATTA CACGTCTACG CGTAGAAGTG GTTGATAAAT CAAAAGTAGA TGTAGCAGGT 5040

ATTAAAGCTT TAGGCGCATC AGGTGTATTA GAAGTTGGAA ACAATATGCA AGCTATCTTT 5100

GGTCCAAAAT CAGATCAAAT TAAACATGAT ATGGCCAAGA TTATGAGTGG TGAAATTACG 5160

AAACCAAGTG AAACGACAGT GACTGAAGAA ATGTCAGATG AACCAGTTCA CGTAGAAGCA 5220

CTTGGAACAA CAGACATCTA TGCACCAGGT ATCGGTCAAA TCATTCCATT ATCAGAAGTA 5280

CCTGATCAAG TATTCGCTGG TAAAATGATG GGTGATGGTG TTGGCTTTAT CCCTGAAAAA 5340

GGTGAAATTG TAGCACCGTT TGATGGTACA GTGAAAACAA TCTTCCCTAC GAAACATGCG 5400

ATAGGATTAG AATCTGAAAG TGGCGTCGAA GTACTTATTC ATATTGGTAT CGATACAGTG 5460

AAACTGAATG GTGAAGGATT CGAAAGTCTG ATTAACGTTG ATGAAAAAGT AACACAAGGT 5520

CAACCATTAA TGAAAGTGAA TTTAGCATAC TTGAAAGCAC ACGCACCAAG CATCGTTACA 5580

CCAATGATTA TTACAAATCT TGAAAATAAA GAACTTGTCA TTGAAGATGT ACAAGATGCT 5640

GATCCAGGTA AGCTAATTAT GACAGTCAAA TAATGATTAA AAATGAAACA GCATATCAAA 5700

TGAATGAACT TTTAGTCATT CGTAGTGCGT ATGCGAAGTA GCGAGTTGAA AGAGAATACG 5760

TTACAAAAGG CAGTAGCTTA AAATGAAGCT ACTGCCTTTT TAGTGCGCAA TGATGTATAG 5820

CAGGTGTGTT GATGrTAATA AGTTAAATAT TAGTGTTAGA TATAGAAAAC ATTGCTTATG 5880

TTTTTGTCAC ATTTTAGAAA AATGCATCTT CGCGACTAGC CAAATTAATA GTCTCATTGA 5940
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AATAAATTAA CATGATTTTA AATCTATTTG TAAGATAAGG AGATTTGTCA TTATGACAAC 6060 

AGAAGGTCTA TTAGTTGCAG AGAAAGAAAT CGAAGTGAAT GGTTACGACA TTGATGCGAT 6120 

GGGTGTCGTT AGTAATATCG TTTATATTAG ATGGTTCGAA GATTTGAGAA CAGCGTTTAT 6180 

TAATCAGCAC ATGAATTACT CAACAATGAT CAATCAAGGC ATTTCACCTA TACTTATGAA 6240 

AACGGAAGCA GAGTATAAAG TACCTGTCAC AATACATGAC AAACCAGTAG GTCGTATTTA 6300 

CTTAGTTAAA GCAAGCAAGA TGAAATGGGT GTTTCAGTTT GAAATTGTGT CCGCACATGG 6360 

CGTGCATTGT ATTGGTACAC AGACAGGCGG TTTTTACAGA TTGAGTGATA AGAAGATAAC 6420 

CTCTGTGCCA CAAGTGTTTC AAGACATTTT AGCAACAAAA TAATGACTTC ATTTTAAAAT 6480 

ATAAAAAGTA AGAAGGTGTT CGAAATGGTT AAGCAATTAA ATAGTGTCGA AGCATTCCGT 6540 

GAATTTATTC ATCAATATCC GTTAGCAGTT GTACATGTCA TGCGCGATCA GTGTAGCGTG 6600 

TGTCATGCCG TTTTACCACA AATTGAAGAC TTGATGCAAT CATATCCCAA TGTGCCATTA 6660 

GCTGTGATTA ATCAAAGTCA GGTGGAAGCT ATTGCTGGAG AATTAAATAT TTTCaCTGTA 6720 

CCTGTGGATT TAATTTTTAT GAATGGAAAA GAAATGCATC GTCAAGGGCG TTTTATCGAT 6780 

ATGCAACGTT TTGAACATCA TCTTAAGCAA ATGAATGATA GTGTAAATAA CGATGTCGAT 6840 

GAGCATTAAT ATCGCAAATG ATTAGCATTG CTAAGATTAT GTAGACATCA TAACTTATTT 6900 

CCCAGTAAAT ATTGGTAGTA ATTAGAATCA GCATGGTACA GTAGAACTAT AGTAGAAATC 6960 

ATCAAAGAGG AGTGACGACA AATGCGTAAA AAATGGTCTA CACTTGCGTT TGGATTTTTA 7020 

GTTGCAGCAT ACGCACATAT TAGAATTAAA GAAAAACGCA GTGTGAAAAG TTATATGTTA 7080 

GAACAAGGTA TACGATTATC TAGAGCTAAG CGTCGTTTTA TGTATAAAGA AGAAGCGATG 7140 

AAAGCATTAG AAAAAATGGC GCCACAGACA GCAGGCGAAT ATGAGGGAAC CAATTATCAG 7200 

TTTAAGATGC CAGTAAAAGT GGATAAGCAC TTCGGTTCAA CCGTTTATAC CGTTAACGAT 7260 

AAACAAGATA AGCATCAACG CGTTGTATTA TATGCACATG GAGGCGCATG GTTCCAAGAC 7320 

CCACTCAAAA TTCATTTCGA ATTTATTGAT GAACTTGCAG AAACACTCAA TGCTAAAGTC 7380

ATCATGCCAG TATATCCGAA GATTCCGCAT CAAGATTATC AAGCGACGTA TGTGCTTTTT 7440

GAAAAGTTGT ACCATGATTT ATTGAATCAA GTAGCAGATT CTAAACAAAT CGTTGTAATG 7500

GGTGACTCTG CGGGCGGTCA AATTGCTTTA TCATTTGCTC AATTGTTAAA AGAAAAACAT 7560

ATTGTGCAAC CAGGACATAT TGTATTAATT TCACCAGTTT TAGATGCAAC GATGCAGCAT 7620 

CCTGAAATTC CTGACTACTT AAAGAAAGAC CCAATGGTAG GTGTGGATGG CaGTGTGTTC 7680 

TTAGCTGAAC AATGGGCAGG GGACACACCT TTAGATAACT ACAAAGTATC ACCAATTAAT 7740 
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CCAGATGCTT TGAACTTATC GCAATTGTTG AGTGCGAAAG GTATCGAACA TGACTTTATA 7860 
CCTGGATATT ACCAATTCCA TATTTATCCA GTATTTCCGA 7900 
(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1984 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

GTCTAAATAA ACAAAATTAT CATTGATTaC TGAACTGGCA TTTCGAAGTA ATGCTTCAAT 60 

ATCATTCGAA TATTTCTTCA ATTTATGATT GTGAAATAAT TCTTGCATCA AAAATGGTCT 120 

TTGGTCACAT GAATGTGCAT CTGAAGCTAC AAAATGAGCC AAATTACATT CTATAAATTG 180 

TAATGATAAC TTTTGAATGT TTTTACCAAA TCCACCAACT AAAGAACTCG ATGTTAATTG 240 

ACTCAGTGCC CCATTTGCAA CCAATTCATA TAATATTTCC GGATTTTTGG CGATACTTCT 300 

ATTTCTCTCA GGATGTGCAA TGATTGGTAT GTAACCTCTC GATTGTATTT CAAAAAACAA 360 

TTGTTTTGTA TAATGTGGTA CTTCGCCCGT TGGAAATTCA ATTAATAAAT ATTTCGAACG 420 

ATTAATACCT TGAATACTAC CATTATCTAA GCCTTTCAGA ATCGAATCTG TAATTCTAAT 480

TTCTTGCCCG GGAAATAATT TAATATCCAA TGCTTGAACT TCTGGATGCG TTCTTAACTC 540

CGCCAATTTC ACAAGCACTT GTTGAAATGT ATTATCATAT CTCGGATGCA AATGATGAGG 600

TGTCGCTACA ATACTTGTTA CACCTTCATC CTTAGCTTGC TTTAATAGTG CAATACTCTT 660

TTCAATTGTT TTAGGACCAT CATCTATATC AACTAATATA TGGTTATGAA TATCAATCAT 720

GATTCATCAG TCCCATAATA TGCATAGTAA CTAGCACTTT TATCTTTAGG CATTCTATTT 780

AAGACTACAC CTAATAATTT AGCACCTGTT GCTTCAATAA GTTCTTTTCC TTTTTTAACT 840

TCATCTCTAT TATTATTTTC CGAATTAACT ACGTAGACAA CATTGCCGGT AAACTTTGAA 900

AATAATTGCG CATCTGTAAC TGTGTTCACT GGTGGCGTAT CGATAATTAC AAAGTTATAA 960 

TTCATCAATA ATGTGTCATA CAAATTTGCA AATGCCCTTG ATGTAATTAA CTCTGACGGA 1020 

TTCGGTGGGA TTGGCCCAGA CGTCAAGACG TCTAAATCTT GAATTTCAGT TGAGATAATA 1080 

CTGTCTTGAT AAGTTGACCA ATTTAGCAAT AAACTTGATA GGCCTTCATT GTTTGGCAAA 1140 

TTAAAAATAT AATGCTGCGT AGGTTTACGC ATATCCCCGT CTACGATTAG TGTTTTATAA 1200 

CCTGCTTGCG CATATGCAAC TGCTAAATTT GCTGCAATTG TAGACTTACC TGCGCCTGGT 1260 
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GATCTTATGC CTCGAAATTT CTCGCTAATA GGTGACTTTG GTTGTTCATG GACAATTAAA 13 80 

CTTGATGTAC TTCyTCGTGT ATTCGTCATG GTAATTCCTC GTAAATTAAA ATTTTTGTAT 1440 

TGAACCTAAA ATAGGTAATC CTAGTTGCGA TTCAACATCT TCTTCTGTCT TAATACGCTT 1500

ATCTAATAAT TCTTTTAAGA AAATAATCAA TATTGCTAAA ACAATACCAA CAATAATGCT 1560 

GATAACTAAG TTGACAGATA CTATTGGAGA TACTTTTACA GCATTATCAT GTGCTGAGGA 1620 

AAGTATCGTA ACATTATCAA CACTCATAAT TTTAGGCATG TCATGAGCAA AAACTTTAGA 1680 

TATTTTATTA ACAATTTTGT CAGATTCAGA TTTATTCCCA GTGGTAACTG ATACAGTAAT 1740 

AATTTGAGAG TTTGTTTGAT TGGTTACTTT TAAAAATGAA TTCAACTCAG CTGTTGAATA 1800 

CTGACCATCA AnTTCTCTAG ATACTTTATC TAGAATTCTA GGACTTTTGA TAATTTCCGT 1860 

ATATGTATTA ACAGACTGCA AACTACTTTG AACATTTTGG AAAGCTAAAT CACTTGAGGA 1920 

CTTTTTCATG TTCACTAATA TTTGAGTAGA AGCAGTATAT TTGTCAGGCA TAACAAAAAA 1980 

GGTT 1984 
(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

CAAATCCCTT GGTGATGAtA AAtGtATTGC TGTGTAGCCA AATAATCTTC GTATATATGA 60 

CTGACGTTCA ACAACAGCTT GCAATCGTTT CGTTGGTACA GTTACTTTCT TCTTGTTAAA 120

GAGACCATAT TCAATTTTAA GTTGCTCATT TTCAAGCATC ACCGAAAAGC CATAAAATCT 180 

TATCATTGTT ATAATCGTTC CAATAATATA TGCCACTATT AATACTAGTA AAATGATGAT 240 

TAATACTGAA ATACTTACAA TTTGAACCCA TTGACTAATT TCATGATTTA GCTTCGACCA 300 

TGGGATCAAC TCTCTTACAG CCCCGTAAAT CGGTACTAAA GCTGCTAACG TTACACCAAT 360 

GGCGCCACTG GTCATTGCCA TAAATAGTGA TTCTTTAAAA TTCATCTGAT ATATAGGAAT 420 

GCGTTTATTT TTCTGATTAA GCATACTATC AGTGTTCTGC ACTTCATCTA AGCGACCTTC 480 

TGCGATGTCT TCCACATTAC CTTCAATGTC ATGATTACAG TTGTCATTCT TCTCAGCACT 540 
AGACTTTTGC GCCACTTCTG TCTTCAACTC TGTTTGCAAT TGATCAATAT ATCGTTCAAG 600
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ATATTCACCT TGTTTTTTCG AAATAACACT TAAGACAATA CCATCACTTG GTGTTTTGAT 660 
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AATACGTTTT ATATTTAATT CTTTACGCTT TTTATTAAAA ATACCTGTTG TTAAAATGAA 780 

ATAATTATCC tCAATCCAAT ATCGCGTGTT CATAATTCCG ACAATTTGAG AAATGTATGA 840 

5 

TATTAAAAAG AATACAAATA CAATACCTAT CCATAAATAT GATTCGGGAT TCGTATAATC 900 

AAAATCTTTC AATTGAAAGA TAATGAAAAT AAAAAAGACG ACTATGTTTT GTTTGATAGC 960 

ATTGATTATG CCATTAAAAT ATGAAATCGG ATGTAATTTT TGAGGTTCAG ACATCACTTT 1020 

10 

CAACCCCTCT CAAATTCGAC ATAGTTCTCT CTTCGATTAT TTTAACATCG TCATGAGACA 1080 

TCATCGGTAA ATAAATAGTA TGACCTGCAG TCATAAATCC AACTTTATAC AAATTAAGCA 1140 

15 CTTTACTAAT TGGATTAGAT TTAATCGACA AGTATTGTAA ACGTTCAATT CGACTCGTTT 1200 

CTTCTTTATA TATAAAAAAT GATGTACGAT ATTGTACACT TAGTTGATCA ACTTTATAAA 1260 

AGCGACAATG ATATTGCCAT AAAGGCTTAA TAAATAATTT TAATGTACTC AGAGCACCTA 1320 

20 AAACCAACAA AATATAAAGT AAGTAATGTG GCCATTCAAA TCTTAACCAT ATAAAATAAA 1380 

AAATGACATA CACAGCTACA CTCAATATAA ATTCTAAGCC ATTCGTAATG TAGTAATACA 1440 

ACAATGCTGA CTTAGGACTC TTAGTCAACT TAGTATAATC TGACATATAC CCCTCTCCCC 1500 

25 

AAATAAAAAA TTATACGGAT TTATAATCTA TTTCATTTTA TTTTTATATG ATGATAATTA 1560 

TAGCATATGG AATATTTCAT GCTAATTTAT TCTTCCTAAA GGTACATCTA AAAATTTAAT 1620 

TAAGCAGAAA GTGCTTGAAT TGCTAAAAAG ACACCATGTT ATAATTTTAT CAACATGATG 1680 

CCTTTCATCT ATAATCAATC TTTCATCTTA TCAAGAGCGA TATTTAGTTC AAGCACATTC 1740 

ACATAATCAT TTGTTAACAC ACCACGCTGC TTACGATGTT GAATCAAGTC GGCCACTCTT 1800 

GAAGTAGATA CATGACGAGC ATCAGCAATA CGAGGTGCTT GCTTCAATGC ATTTTCGACC 1860

GTAATATGCG GATCTAAGCC CGACCCAGAA CTTGTTGCAG CATCTATTGT TACATTTGAA 1920 

TTCCCAAATT TAACATGATG TTTCATGCGT GCTATTAATT CGGTGTTTCC ATTCGATTCA 1980 

TTACTTCCAC CTGAAGATAC GCCGTTTTTA TATAATTTTT CAGGATTCAT ATTATAATCA 2040

ACTGCACTCG GTCTCCCGTG AAAATATCGT GTCTCTGTCC AGTGCTGTCC AATCAATTTT 2100 

GATCCAACTA TACGATTGTC ATACGTAATT AAACTGCCAT TTGCTTGTTG ATAAAAAAAT 2160 

ATTTGACCAA TTAACGTGAT AGCTAACGGG AATAAAAATC CACATAATAC CATAGTTATT 2220

ATCGTTAAAC AAATACTATT TCTTATCGTA TTCATGGTAC AGGCTCCTTC CTCTTTACAC 2280

AAAAAATTGT ACAATCATAT CTATTAATTT AATGCCTAAA AACGGGACGA TTAATCCACC 2340

TAATCCATAA ATCAACATAT TATTTATAAA GATTCTATCA ATGCTGTAAC CCTTTACTTT 2400

TACACCTTTC ATGGCAATTG GAATTAAGGC AACAATGATT AATGCATTGA ATATCAAAGC 2460
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AATTGTTGAC ATCATTAGTG CAGGTAAAAT 
TGCAAAGTAT TTTGCTACGT CATTAGCCAA 2580

ACTAAATGTC GTTAATGCAC CTCTCGTCAT TAATAATTGT TTGCCTATTT TTACAACCTC 2640

TATTAACTTT GTAGGATTCG AATCTAAATC AATTAGATTA GCTGCCTCTT TAGCACTAAT 2700

TGTCCCTGAG TTCATAGCTA ATCCTATATT CGCTTtGTGc tAGCGCAGGT GCATCATTTG 2760

TACCATCTCC TGTCATCGCA ACAATATGGC CTTTCGCTTG TTCATCTTTG ATGACTTTAA 2820

TTTTATCTTC GGGTTTACAC TCTGCAACAA ATCTATCAAC CCCGGCTTCT TTTGCAATTG 2880

TAGCTGCTGT TAAAGCATTA TCACCTGTAC ACATAACTGT TTCAATCCCC ATTTTTCTCA 2940

ATTCAGTAAA TCGTTCTACA AGACCATCTT TAATCACATC TTTTAAATAA ATCACGCCAA 3000

GCATGACATT GTTTTCAATG ACTATTAAtG GnGTGCCACC TTTACTCGAT ACATCCATAC 3060

AGAGAGACTC AATATTAAGA GGAATATTGC CTTGTTGTTG TTTGACAAGA TTTATCATAC 3120

TATTAGGTGC ACCTTTGAAT ACCGATATTT CATTTGTAAT GATTCCGCTC ATTCTAGTTT 3180

CAGCTGTAAA AGGCTTATAT GTGCCATCAA TGTCTTTAGG CAGCTCATTT ATATACATcT 3240

GcttCGCTAA TCGTACAATA CTTTTTCCTT CTGGCGTATC ATCGTAGATT GATGACATAT 3300

AAGCAGCGAC TATCAATTTT TCAAGCATTT GTTGATTCAC TGGTAAAAAT TCACTAGCGA 3360

TTCGATTGCC ATAAGTGATT GTGCCTGTCT TGTCTAAAAT CATTACATCG ACATCTCCAC 3420

ATACTTCTAC AGCACGCCCA CTTTTCGCTA ATACATTGAA TTGAGTAACA CGATCCATGC 3480

CTGCAATACC AATCGCCGAT AACAAACCAC CGATTGTCGT TGGTATTAAA CATACTGTTA 3540

ACGCAATGAG CATCGCAATA GGTAAAATTA AATGCAGGTA AGATGCTATT GGATATAACG 3600

TTACAATAAC GACTAAAAAT ATAATTGTTA ACGTTGTTAA TAATGTAAAA AGTGCAATTT 3660

CATTTGGTGT TTTATTTCTT TCCGCCCCTT CAACTAAGGC AATCATTTTA TCTAAAAAAG 3720

ATGTACnCGC TTCACTCTCA ACACGTATTT CTAACCAATC AGATGTTACA AGTGTACCGC 3780

CAATGACTCC ATCAAAATCG CCACCTGATT CTTTTATCAC AGGTGCAGAC TCACCAGTAA 3840

TTGCAGATTC ATCAACGGTT GCTAATCCAT TTATTACAAC GCCATCAGCA GGGATTGTTT 3900

CTCCATTTTC TACCCGAATA TTTTGTCCGG CTTTTAACTC TGTGGCGTTC ACTATCCGAT 3960

ACGCACCATT TTCTTCTATC AATCGAGCAG TTAAATTTGA TTGTGCTTGT CTTAAACTAT 4020

CAGCTTGCGC TTTTCCACGA CCTTCAGCAA AGGCTTCTGA AAAATTAGCA AACAATATAG 4080

TTATTAATAA TATGATAAAA ATTGTAATCA AATAACCTCG CGATAGATAG CTAGTTCCAA 4140

ATATGTCAGG AAAACATATT AATATCAACG TTAAAATCAT TCCAACCTCA ACGACAAACA 4200

TTATCGGATT TTTTATTAAT TGTTTAAGAT TCAGCTTATA AAAACTCATT TTCAAAGCTT 4260
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TTTATTTTAA AGTTAAAAAT TCACCAATAG GACCAAGTAA TAGTACTGGA ATAAATGTCA 4380 

AACCACTTAG TAAAACGATA AATACGATTA GTGATACGCC AAAATAAGGT TTATCAATCG 4440

CTATTGTATA TTTATCTTGA TGGTATGATT TTTTATTCAC TAAACTTGAT GCAATCATTA 4500 

ATTGCAAAAT AATTGGTATA TAACGAGAAA GCAACATAAT GATTCCTGTA GAGATATTCC 4560 

AGAATGTTGT ATCATCTTTC AGTCCTTCAA ACCCTGATCC ATTGTTCGCA GCAGCTGATG 4620

TCATTTCATA CATAACTTGT GAAATACCAT GAAAAGACGG ATTCGTtATa CTTtCACTTG 4680 

CTCCAGGAAT CATAAAAGCA AGTGCTGAAA ATACTAAAAT TAAAATTGGG TGTATGAGAA 4740 

AGACTAAGAC AATACATTTC ATTTCACGGG CGCCAATTGG CATATTTAAA TATTCTGGTG 4800 

TTTTACCAAC CATCAAACTG CATATAAACA CCGTCAGTAA GACAAATATC AATAAATTCA 4860 

TGAGTCCTAC GCCTTCGCCA CCAAATACAA CATTTAGCAT CATTAATACC ATTGGTCCTA 4920

ATCCACCTAT AGGCGTTAAG CTATCATGCA TGTTATTAAC AGAACCCGTT GTAAATGCCG 4980

TCGTAATAAC TGTAAATAGT GCTGACAAAC CTGCTCCAAA CCGTACCTCT TTACCTTCCA 5040 

TATTCGGTCC ATAAATGCCT AAATTCGCTA GTATTGGATT ACCACGATAC TCACTCCACA 5100 

TAGTTAATGT AAGAATTGCT ATAAAAATGA AAAACATTGC GACAAATAAT ATCAACGCAT 5160 

GACGATGTAC TCGTTTACCA TGTCTACTTA ACATGCGACC AAATAAGAAC AACATTGACA 5220 

TAGGAAGTAA CATCATACTG CCCATTTCTA TAAAATTGCT CCAAATATTT GGATTTTCAA 5280 

AAGGTGTTGC AGAATTTCCT GCTAAAAATC CTCCACCATT CGTACCAAGA TGTTTTATTG 5340 

ATTCAAGTGA TGCAATAGGT CCAAATGCAA TATGTTGAAT ATGTCCGCTT AAAGTCCGAA 5400 

TCATTAAATT AGCATGCAAC GTTTGTGGTA GaCCTTGAGT CATCAATAAA ATACTAATTA 5460

AACATGATAA TGGTAAAAGT ACTCGGACAA TAAACCGAAC AATATCTTGA TAAAAATTAC 5520 

CAATGATATT AGTTAATCCA GTTAAACGTC TCAACATCGC TATACAAACG GCGTAACCTG 5580 

ATGCACTAGA TGTAAACATT AAATATGTCA TTACAATCAT TTGCGTTAAA TATGTCACAT 5640

CTGaTTCACC GTTATAGTGT TGtAAATTAC TATTTGTTAA AAAAGATATT GCTGTATTAA 5700 

ACGCTAAATC TATCGATTGG TTTAAATTAT GATTTGGATT TAAAAAAAGC CATTGCTGAA 5760 

CTATTAGCAA TACAAATGTT ATAAACCCCA TAAATCCATT AAATGCCAGA AAATGTTTGA 5820 

CATATGTTTT AGCTGACATG TGTTCTAAAT CTGTGCCGAT AATTTTAAAA CACATATTTT 5880 

CAAATCTAGT AAATATTAAA TCTACTCTTG ACGATTGCAC CAATGCTACG CGATATAGAT 5940 

ATCCACTAAA AACATACGTA ATCATAACCA TCATTGTTAG AAACAAAATT ATTTCCATGA 6000 

TAACCCTCAC TTAATATATT TCTAAAATTT TTCACTACGA ATTAAGGCAT AAAATAAATA 6060 
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ACACAACAAC ATCGTAACAA CTTGTTTATG AGAGAAATnT TAATTTTCAA ACTTAGTTAT 6180 

TAAGAAAnCA TTAAGATGTG TATGCAGAAA TAAATTTTAT AGCATTTAAT TGTGAAGAAT 6240 

ATTATGATAT TGCTATCGAG GTGAAGGTTA TG 6272

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1978 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

AAATGATGTT TTACAATAAA TATAnAAACG TATCAACATA TATCATCATA TTTTTAGTTT 60 

CAAGTGCAGC CTTTGCAATA TTCTTGTTAA GTGCGnACAT TAGTGCTCAC TCGGAACAAG 120 

TGTACGAAAT GACTGACCAT CAAATTAAGA ACAATACGAT AAATAAAGCA TACGAACATA 180 

AAGACCCTAC AAACAATAGC GAACAAAGAG ATGGGAAAGT GTTCGCTTTA ATAAATTGAT 240

ACATTGTCAC AACGTTATTT TGCCTATTTT TGCGmAATAG CGTTTTTTAT TACwTTTTTG 300

CTGATsTTAA ATTTGTTATA TTTTGTTAAA GTATTATAAT GATTGAATAA ACAAATTGAA 360 

GGTAGGTTTT TTAATTGAGT AATTCTGATT TGAATATCGA AAGAATTAAC GAGTTAGCTA 420 

AAAAGAAAAA AGAAGTAGGA TTAACTCAAG AAGAAGCAAA GGAGCAAACA GCCTTAAGaA 480 

AAGCTTATCT TGAGAGTTTT AGAAAAGGGT TTAAACAACA AATTGaAAAT ACTAAAGTAA 540

TTGATCCAGr AGGTAATGAT GTAACACCTG AAAAAATTAA AGAGATACAA CAAAAAAGAG 600 

ATAATAAAAA TTAAATCACA AATCTGTAAA GAATTTTCTG ACATTATAAC TTGAAATAAG 660 

TATtTTACTT ATCTTTTTAT TTTAAAATAA GTTATAATGT ATTTGATAAA ATTGAAGAAG 720 

GGAAGATACA CAAGATGTTT AATGAAAAAG ATCAATTAGC TGTTGATACG CTACGTGCAC 780

TAAGTATCGA CACAATCGAA AAAGCGAATT CTGGTCATCC AGGATTACCT ATGGGAGCTG 840 

CCCCAATGGC TTACACTTTG TGGACACGTC ATCTGAATTT TAATCCACAA TCTAAAGATT 900 

ACTTCAATAG AGACCGTTTC GTATTATCTG CAGGGCATGG TTCAGCATTA TTGTATAGCT 960

TGTTACATGT TTCTGGTAGT TTAGAATTAG AAGAATTAAA GCAATTTAGA CAATGGGGTT 1020 

CTAAAACACC AGGTCATCCT GAATACAGAC ATACAGATGG TGTAGAAGTT ACTACCGGAC 1080 

CACTTGGACA AGGTTTTGCT ATGTCAGTAG GATTAGCTTT ACAGAAGATC ACCTAGCAGG 1140 

gAAATTTAAT AAAGAAGGAT ATAATGTTGT AGATCATTAC ACATATGTAT TAGCTtCTGA 1200 
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AAGTAAATTA GTTGTTTTAT ACGATTCAAA TGATATTTCA TTAGATGGCG AATTAAACAA 1320 

AGCTTTTTCT GAAAACACAA AAGCTCGTTT TGAAGCATAT GGTTGGAATT ACTTACTAGT 1380 

TAAAGATGGT AATGATTTAG AAGAAATTGA TAAAGCGATT ACTACAGCTA AATCTCAAGA 1440

AGGACCAACG ATTATTGAAG TTAAAACAAC AATCGGATTT GGTTCACCGA ATAAAGCAGG 1500 

AACTAATGGT GTTCATGGGG CACCTTTAGG TGAAGTTGAA AGAAAATTAA CATTCGAAAA 1560 

TTACGGTTTA GATCCTGAAA AACGTTTTAA TGTTTCAGAA GAGGTATACG AAATTTTCCA 1620 

AAATACTATG TTAAAACGTG CTAATGAAGA TGAATCTCAA TGGAATTCAT TATTAGAAAA 1680 

ATATGCAGAA ACATATCCTG AATTAGCAGA AGAATTTAAA TTAGCGATTA GTGGTAAATT 1740 

GCCTAAAAAT TATAAGGATG AATTACCACG TTTTGAACTG GGTCATAATG GTGCATCTCG 1800 

TGCTGATTCT GGTACTGTTA TTCAAGCAAT CAGTAAAACT GTCCCTTCAT TCTTTGGTGG 1860 

ATCAGCAGAC CTTGCTGGTT CAAACAAATC CAATGTAAAT GATGCAACTG ATTATAGTTC 1920

TGAAACACCT GAAGGtAAAA ATGTGTGGTT TGGTGTACGT GAATTTGCTA TGGGTGCT 1978 
(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7588 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

TAGTAGTATT TATTAAATTA TACGAAGGGA CCcAACACAG AAAATTCATT TTATTGAATT 60 

TTACATTTAT GTGCCAAGTT GGGAAAAATG TCTTATTTTT TCaAAGTATT TAAAAGTAAA 120 
ATTACATGTT AATACGTAGT ATTAATGGCG AGACTCCTGA GGGAGCAGTG CCAGTCGAAG 180 
ACCGAGGCTG AGACGGCACC CTAGGAAAGC GAAGCCATTC AATACGAAGT ATTGTATAAA 240 

TAGAGAACAG CAGTAAGATA TTTTCTAATT GAAAATTATC TTACTGCTGT TTTTTAGGGA 300 
TTTATGTCCC AACCTTTTTA GAATATTAAA TTTCTACAAT TTCGTCATCT TCAACAATAA 360 
AGCCCATTGT ATTGACGCTG TTATTTAAGA AAGTCAGAAT ATAACGCATT ACTTCATCAC 420

GTTCTGGCTC ATTGTGAACC TCGTGGTAAA AACCTTGCCA AGCTTTAAAA TATAATTCAG 480

GTGTTTGATA TTTTTCTTTA AACTCATCAA TTGCCCTAGT ATCAACAATT AAATCCTTCG 540 

TTCCATACAT TAATAGCGTT GGCATTGGTT GAATGTCATG AATATGAGCC ATCGTATCTT 600 
TCATCGTCTC ATTAATTGTA TTATACCAAT GATACGTTGC TTTTTTTAAC ATTAAACCAT 660 
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CATTAAAACG TGTGTCTTTT GAAATTTTAC CTATATTTGA AACAAGTTTA TCTTTACGAT 780 

TTTTTCCATT CTTTTGAAGT TCTAGCATAG GAGAAATTAA CATCATCCCC TCGATTGGCA 840 

ATTCTACTTT TTCAAGTAAA TTTAATAAAA TCAAACCGCC AAGTCCTACC CCTAATACAT 900 

AAGTAGGAAT TTTATATTCA TTAGCTATCT TTAACCAGTC TAGCAAACTT TCGTGATACG 960 

TTTGAAAGTT TTCAATTTGT CCTTTATTAG CTCTTGAAGT TTGACCTTGA CCAGGCAAAT 1020 

CTCCCATAAT CACATGATAG CCATTTCTTC TTAACATCGT AATAACATAT GCATATCTTC 1080

CCGTATGTTC TAATATATTA TGAGCAATAA CAACGACGCC TTTCGCATCA TTTTCAGCTT 1140 

CCCACTTCCA CATTATTATA CTGCCCCTTT TTCATTAATC TTCAATAACA TAATTATAGC 1200 

AAATTCACTA TGTAGATTTC TATTTATAGT ATTATTGTTG TCCATATTAT TATATATAAA 1260 

TGAAATCAAC ATCAATAATA GTGTAATTAT ACATAATTAT TTTTGATTGT TTTTGATGAA 1320 

AACGCTTTCT CGAATATTTT TTTCATGCTA AACTTATTGT AAACACAAGG GTTTGGAGGA 1380

GTAGCAATGG CACTATTAAA GAATTTTTTT ATCGGATTAT CTAATAATAG TTTTTTAAAC 1440 

AACGCAGCAA AAAAAGTGGG CCCACGTTTG GGCGCCAATA AAGTCGTTGC CGGAAATACA 1500 

ATTCCAGAGT TAATTAATAC AATCGAATAC TTAAATGACA AGAATATCGC TGTTACGGTA 1560

GACAATTTAG GGGAATTTGT CGGTACAGTT GAAGAAAGTA ATCATGCTAA AGAACAAATT 1620 

TTAACAATTA TGGACGCGCT TCATCAACAT GGCGTAAAGG CACATATGTC TGTTAAATTG 1680

AGTCAGTTAG GTGCAGAATT CGACTTAGAA TTAGCTTACC AAAATTTAAG AGAGATTTTA 1740 

CTTAAAGCAA ATACTTACAA CAATATGCAT ATAAATATTG ATACTGAAAA ATATGCTAGC 1800

CTGCAACAAA TTGTTCAAGT TTTAGATCGC TTAAAAGGCG AATTTAGAAA TGTTGGTACT 1860 

GTAATTCAAG CATATTTATA CGATAGCCAC GAATTAGTTG ATAAGTACCA AGATTTACGA 1920 

TTACGTTTGG TTAAAGGTGC ATATAAAGAA AACGAATCAA TTGCATTTCA ATCTAAGGAA 1980 

GACGTAGATG CAAATTACAT CAAAATAATT GAACAACGTT TGTTAAACGC ACGCAATTTC 2040 

ACTTCAATTG CAACACATGA CCATCGCATC ATTAATCATG TAAAACAATT TATGAAAGAA 2100

AATCACATTG AAAAAGATCG TATGGAATTC CAAATGCTCT ATGGTTTTAG ATCAGAGTTA 2160 

GCAGAAGAAA TCGCAAATGA AGGCTATAAT TTCACTATTT ATGTACCTTA TGGCGATGAT 2220

TGGTTTGCGT ATTTTATGAG AAGATTAGCA GAACGCCCAC AAAACCTATC TCTTGCTGTA 2280 

AAAGAATTTG TGAAACCTGC TGGCTTAAAA CGTGTTGGCA TAATTGCAGC TTTAGGAGCT 2340

ACAGTTATGT TAGGTTTAAG TACAATTAAA AAATTATGCC GTAAATAGAG CAAGACATAA 2400 

ACAATAATTT AGGAGTCTGG AACAATAATC AATGTTCTAG GCTCCTAAAT GTTATATTGG 2460 
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TAGATTTTAA 
TAAATTAGCC ATTTCAATTG CACTTACTGC TGCTTCAGCA CCTTTATTGC 2580

CAGCTTTCGT ACCTGCTCTT TCCACAGCTT GTTCAATAcT TTCAGTCGTT AAAATACCAA 2640

ATATGACTGG TACATTAGTT TGATCATTCA CTTTAGAAAC ACCTTTCGCG ACTTCATTAC 2700

AAACATAATC ATAATGAGAC GTAGCACCGC GAATTACGCA TCCTAATGTA ATTACTGCAT 2760

CATAATTTCC TGATGAGGCT AATTTTTTAG CTACTAAAGG AATTTCAAAC GCACCTGGCA 2820

CAAATGCTAC ATCAATATTG TCTTCATTAA CATCATGTCG AATCAAAGTA TCTTTTGCAC 2880

CTTCAAGTAA TCTTCCAGTG ATAAAATCAT TAAATCGACT AACTACGATT GCAACTTTCA 2940

AATCTTTTCC AATTAATTTA CCTTCAAAAT TCATGTTAAA ATCCTCCTAT ATTAAATGAC 3000

CCATTTTTAT TTTTTTCGTT TCCATATAAT CATGATTATG TACCGTTTCT GGTACGATAA 3060

CTTCAATTCT TTCTGCAATA TCAATGCCAT ATTGTTTTAA TCCCTCAAAT TTACTTGGAT 3120

TATTACTTAA TAAATTGATA TGTTCGATGT TAAAATATTT TAAAATCTGT GCAGCAATAT 3180

GATAATCTCG CAAATCTTCA TCAAAACCTA ATGCTAAATT TGCAGTTACT GTATCATATC 3240

CTTGCTCAAT TAATTCATAT GCGCGTAATT TGTTTAACAA TCCTATGCCA CGACCTTCTT 3300

GAGGTAGATA AATAATCATG CCACCATGTT CATTGATATA CTTCATAGAC GATTCAAGTT 3360

GAGCACCACA ATCACAACGT TGACTATGGA AAATATCGCC TGTAAGgCAC GCAGAATGTA 3420

AGCGTACATT TTCATGTTGT CGAATTGCAC CTTTTGTCAG TACAACTATC TCTTCATCTG 3480

TGTATGTCGC TTTAAAACCA TACATATCAA ATGTTCCGAA ATCTGTAGGC ATTTTCACTT 3540

TTGCCTTAAA TTCAATTTCT GGTTCTAATT TTTTACGATA TTCAATTAAA TCATCAATCG 3600

TAATCATCTT TAATTGATGT TTTTCTTTAA ACTTTTGTAA ATCTTGTCCT TTCGCCATCG 3660

TGCCGTCATC ATTCATAATC TCACAAATGA CACCAGCGGG CTTGGCACCA GTAAGTTTAG 3720

CTAAATCAAC AGCCGCTTCT GTGTGTCCAT TTCTAGCTAA TACGCCTTTA TCTTGTGCTA 3780

CTAATGGAAA TAAATGACCA GGACGATTAA AATCTTTAGC TTCACTACTA GGATCAATGA 3840

GCTTTTTGGC AGTCAATGTA CGTTCATAAG CACTAATTCC TGTTGTTGTA TCTACATGAT 3900

CAATACTCAC TGTAAATTGC GTACCAAAGA TGTCGGAGTT ATCATCAACC ATTTGTACCA 3960

AATCCAAACG TTGTGCAATA TCTTTAGACA CTGGTGCGCA TATTAATCCC CtTGCTTCTT 4020

TCGCCATAAA ATTAATGGTA TTATCGTTCA TCCATTCAGT AACCGCTACT AAATCACCTT 4080

CATTTTCACG ATTCTCATCA TCTACTACAA TAATTGGTTC TCCATTTTTT AAAGCCATTA 4140

AAGCACTGTC AATATTATCG AATTGCATGC TACCCCTCCt AAAAACCAAA TGCTCTTAAT 4200

TTATCTACAG ATAATTGGTC TTTATCTTTA TTTAAAATAT TTTCAACATA TTTAAACAAA 4260
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CTCGTTTCTG GAATAAGATG AATGTCAAAA CTGTTATCAT GCTTATCAAA TACCGTTAGA 4380 

CTAACACCAT CCACAGTAAT AGACCCTTGC TTAACTAACT GATTATTAAT ATGTTGGCTA 4440

CATTGAATCG TAATAATTTT TGCATTGGCT GTTTCATTTA TTTTTGAAAC TGTTCCTAGT 4500

TCATCTACAT GACCGAGGAC AAAATGTCCA CCAAACCTAC CGTTACCACT CATGGCACGC 4560 

TCTAAATTTA CTTCTGATTG TCGCTTAACA TCTGCTAAAT AGGTTTTATT TTCAGTGCCT 4620

TTAATTACTT GAACAGTAAA AGATGTCTGA TTAAAATCAA TCACTGTTAA ACATGCACCA 4680 

TTAACACTGA TGGAATCACC AATATGCATA TCTGCOGTAA TCTTATGTGC TTCAATTTCA 4740 

ATCGTCCTGA CTGATTGACG AATTTGAACA CTTTTAACGA CACCTATTTC TTCAACGATG 4800

CCAGTAAACA TGCATCATCA CTTCTTTCGT AAAGTTAATT TAACATTTTG ATTTAATAAC 4860

TCGGAATGAA CAATTTCAAA TTGGTTCGCA TCTGGTATCT CAATCACATC ATTTGTTTGA 4920 

TAAAATTGAT AATTTCCAGA TCCGCCAATT AATTTCGGGG CATAATAGAG AATAAATTCA 4980

TCTATATAAT TAGATTGGAG AAATTCTGAA GTAGTGGTTG GACCTGCCTC GACTAGCAAA 5040 

GTTCCAACTC CTCTTTTATA TAAATTGTGA AGAATTGTTG TTAAATCGCA AGACTTCAAG 5100 

TAAATAATTT CAATATGTGT TTGATTGGTT GTTAAATTTG GATTTTCAGT ATATATCCAA 5160

ATTGGTGTTG ATTCATCTTG ATAAATTTGC TGATTAAAAT GAATATTCCC AGACTTAGAC 5220 

AATATTACTT TTATAGGGTT TTTTCCATCT TGAATACGTG TAGTATATTG TGGATCATCT 5280 

AATTCAACTG TACGTCTTCC AGTTAACACT GCGTCGTGTC GATGTCTTAA CTTATAGACA 5340

TCTTGTTTAA CCTCTTTGTT AGTAATCCAT TGACTTTGTC CATTATCATT CGCTTGTTTA 5400 

CCATCTAAAC TTGCAGATAC TTTCACTGTA ATTTGTGGCA GTTGCTTTGC TTTTGCTTTA 5460 

AAAAAGTCTT GGTATAATTG TGATGCCCGT TCATCATCAA CGCATTCAAC CTCAATACCG 5520 

TGAGCCCGTA ACGTCTCATC ACCATGTGTG TCTAACGAAT TGTCTTTTGT TGCGTATACT 5580 

ACTTTTGCTA TCTTACAATC AATTATTTTG TTAACACAGG GTGGTGTTGA ACCAAAATGA 5640 

CTACATGGCT CTAACGTAAT ATAAATCGTC GCACCTTCAG CATTTTGTTG TGCCATATCA 5700 

AGTGCTTGAA CCTCCGCATG CTTGTCACCT TTTCTCAAGT GTGCACCAAT ACCAACAATC 5760 

CTACCTTCTT TAACTACAAC AGCGCCAACG GGTGGATTAA CACCTGTTTG ACCTTGTACC 5820

ATATTTGCAA GTTGAATCGC ATAATCCATA AATTGACTCA AATGATCACC TCTATAAACA 5880 

AAAATCCTCA CATCATGAAT TAAGATGCAA GGAGaAAAAT TTATCGTTAA ATAAGCCTAT 5940

TTGTACACAT TTTTACAAAT ACGCTACATT ATCTTTGTCG ATAATTAACA TTCTTTCTCC 6000 

CATCCAGACT TTAACTGTCG GCTCTAGAAT CTCACTAGAT CAGCCACTAA TATGAAACAT 6060 
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TTaTATATGA AATTGTTATA GATTATTTGA GTACGTAGTA TGTCAACTAC ATTTAAAATG 


6180 


ATACTATATG TTTTCTGAAA AAACAATTAA TGACGGTTTT AATTTAATAT AATCTGAGTA 6240

CTATAGGCAT CTCATTGATA TGATTCTTAC TAACAGACAT TAAAATCAAA CCTTCAATTC 6300

GTCTCTATAG AGCGTTCTCT TTATTATCTT CTAGTTACAA ATTATTGATT GtCACtGCGC 6360

TGTTGTTGCT CATTCGATTC TAAAGCATCA TATAATTGAG ATACTGTATG CGCAACTTGT 6420

TCTACAATCA TTTTCACACC GTTTCGTAGT TTATTAACAC CGTTTGTCAT TTGACCTATC 6480

GCAATCATAT TTGTTAATGT TCCAAACCTT GGACTAATAA CTTGATTGGT TTCCGGAATG 6540

ATTTGTATGC CTCCCATTGG GTGTGCTTGT ACAATTTGTC TATTTTCAAG ATTTCTAATT 6600

AATTGATCAT CTTGATCCAA TTCATTTAAA TGACTTTTTG CACCTGTCGC GTTAATGACA 6660

ACATTATATA TGTCTACTGA TTCTTGGTTT TTGTATGAAA AATAATACAA CTTGCCATaC 6720

ATGTTCACAT CTTCTAAATC TTTTTTCAAA ATTAAAGACT TATTTTCTAT TAATTCAATA 6780

ATTAGTTCAG CAGTTCTTGG AGGCATTGGA TTTGAATTTA ATTGAATCAT CTTTGAGTAT 6840

TTTTGATTAA ATTGATGTTG GTCTTCAATA CTTAAGCTAT TCCATATCCA ATTTAAATTC 6900

TCTTTCAAAT GTTCAATCAT ACTTTGGAAA ATGCCCaTTT CTGTTGGACG CGCTAAATCA 6960

TACTTCAAAT CTGCAATATG ATTTCCTGTA CGTCTATGTA CTAATTTTTT AAAATCAATG 7020

TCATATTCAG CACATTCTTT TAAAAATAAA GAAACTAAAG TATCAAGCGG TGCATTGCCG 7080

AAATGATGTT TTTTAATGTC ATTTAATTTG TCTTTAGTTA AGTACTTGAA TGTCACGTCT 7140

ATCATTGTAC CTCTTACACT TGGTAAATGA GCAGAACGAC TCGTCATAGT AATTGGTAAT 7200

TTTGGATGAT GAGCAGCAAC ATAACGGACA ACATCTAAAC TGGCAAGGCC TGTACCAATA 7260

ATCGCAATAT CGTCCAGTTC ATTTACTTCG TCTAACGTAT TATATGTTGG ATAAGGCGTA 7320

gcGAXATATC CTTTTTTACC CTTTAAGTTA TATGGATCAT GGTAGGCAAA TGTACCACAT 7380

GTTAAAAATA CATAATCGTA CGCTTGCCAT GATTGTCCTG AATTTGTAGT ACATATGTAA 7440

TAAGTTAAAT TCGTTTCATC GATATTAGAA TTTGTATAAA TCTCTTGAAC TTTATTATAA 7500

TTAGTTGATA TATTTGGATA TTTTTTCGTG AACATAGATA AATAAGATTT CATATAATGT 7560

CCGAATACAA ATCTCGGTAA ATATGCAG 7588


(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10320 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 





nCTAGGTATT TTAAACCTAA TCTAGATAAA CTAGCTTCGT AAGCAGCTGC TACATTTTCA 60

[illegible] CCTCAAAATA TAATTTTGAA GTAATAAATA AGTCTTCTCT AGCAATACCA 120

GTTGACTCCA ATCCGGCACG AATGCCAGCA CCTACTTGTT CTTCATTCCC ATAAACTTTT 180

GCGGTATCAA TACTACGATA TCCTTGTTCA ATGGCATACT [illegible] CATGCAATTT 240

TCATCATTTT CCACACGAAA TGTCCCTAAA CCAATTTGTG [illegible] TCCATTATAA 300

AATGTTTTAA CCTCCATAAA TATCGCCTCA CCTTTTTGAT [illegible] CTGTTATCAT 360

AACAAATCTG AGTTGAATAC ATGAGAAAAA ACACTTAGAG [illegible] CTAAAATTCT 420

AGTAATATCT CTCAAATATT AATCAAATTG TAAAAGTAAT [illegible] TTATGACAAA 480

CTAAAAAAGC CGAAGTAACA ACATATAGTC ATCACTTCAG [illegible] AATTGAATGA 540

TTCAATTTTA TCCATCATTT GTTGTAAGTC TTCCACGTTG [illegible] GACCATGGAA 600

TACAAATTTG TTAAAGAACT CGTCTAATTG TTCAGCACCG [illegible] TGACAGCACT 660

ATTTTGATTA TAATTTGAAA TCGTTACATC GCCTTCATTT [illegible] AGTATAAAAT 720

TGAAGTTGGT GTATATTTGG CACCTAATTC TTTTTGTAAG [illegible] ATTGTTTAAT 780

CGCCTCAATT TGATCTGAAT AATTTACAAA TGATAATGAA [illegible] CATTTTGATC 840

CATCACAATA GTTTGCGGTC TAGATTTATC TAAATCCAAT GTATCAAATA CTTGTTCCAT 900

TGGTGGTAAA TCTTTAAATT GACCGCCACT AATACCATTA TAAACATGAC CTTTTAACAA 960

TTGAGAATCA ATAATATAAA GACCAGTTCT TGTTAATACT AAATGACTAA TTCGTTCAAT 1020

ATTATTAAAG CCATCCTTTG GTAAAAAGAT ATTTGCCATA ATGTGCATAT CTTCTGGTCG 1080

AATTCGTTTT TCTTTAACTA ATCTTTCACG AATACCAATT AATCTCATGT CCGTTACATA 1140

TTCACTATGA TTTTTCGAGA ACAATTTTAA TGCGTCAATC TCACGATCTT TTGTACTAAC 1200

CATGTGATTA TAATCTTCTT GTTGTTTTGT AATTGTCTTT TTATTTTGAA TACGCTCTTT 1260

CTCTAAAGCT TCTTCATGAG ACTTTTTAAT GTTTTGTTCT TGTTGTTCAT ACTTTTCTTC 1320

TGTTTGTCGC TTAACTTTTT TCTTACTACC TAAGGCAACT AAAAAAAGGA CAAAAAAGAT 1380

TAATGCAATG AgCTACTGCA ATAATGAGTC CAATGACTAT CGGTGAAGAT AAATCCATCA 1440

CAACAACGCT CCTTTTTAAT ATATGAATAA CTTTAATTAT AATAGAaAAG CTAAAGATTT 1500

TCGATACATA TTATCATTTA TATACCGAAA ATCTTTTATT TAGCTATATT CAATTCATCT 1560

TATTATTTTA CTGCGTCTTT TAATTCTTCC ACTTTGTCTA ATTTTTCCCA TGGGAATAAG 1620

ACATCTGTAC GTCCAAAATG ACCATAAGCA GCAGTTTGTT TGTAAATCGG TTGTTTCAAA 1680
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AGTTGCCCTT CAGAAACTTT ACCTGTTCCA AATGTATCAA TTGCAATTGA CACTGGTTCT 1800 

GCAACACCAA TCGCATATGC CAATTGTACT TCACATTGAT CTGCTAAACC TGCTGCAACA 1860

ATATTTTTAG CCACATAACG TGCAGCGTAT GCAGCTGAAC GGTCTACTTT TGTAGGATCC 1920 

TTACCACTGA AGCATCCGCC ACCATGACGT GCATAGCCAC CGTACGTATC AACAATGATT 1980 

TTACGTCCTG TTAATCCTGC ATCACCTTGA GGTCCACCGA TTACAAAGCG TCCTGTAGGA 2040 

TTGATGTAGA ATTTAGTTTG TTCATTAATC AAGTTTTCTG GAACAGTTGG ATAAATGACA 2100 

TGTGCTTTAA TGTCTTCTTG AATTTGTTCA AGTGTCACAT CCTCAGCATG TTGTGTTGAT 2160 

ACGACAATCG TATCAATACG TACTGGGTTA TCATTTTCAT CATATTCAAC AGTGACCTGA 2220 

ACTTTACCGT CTGGTCGTAA ATAATTTAAC GTACCATCTT TACGCACATC TGATAAACGT 2280 

TTTGCCAATT GATGTGATAA ATAAATTGCT AGAGGCATAT ACGTCTCTGT TTCATTCGTT 2340 

GCGTAACCAA ACATTAAACC TTGGTCACCT GCACCTGTTG CTTCAATTTC TTCTTCGCTA 2400

TCTTTATCAC GATACTCTAA TGCTTTATCC ACGCCTTGTG CAATGTCAGG TGATTGTTCA 2460

TCAATCGCAG TTAAAATTGC CATTGTTTCA TAATCATAAC CATATTTTGC TCTTGTGTAT 2520 

CCAATTTCTT TAATTGTTTC TCTAACAACT TTCGGAATAT CAACATATGT TGTTGTAGAA 2580

ATTTCGCCGG CGATCAATGC CATACCTGTT GTAACAGTTG TTtCACAAGC TACACGTGCA 2640 

TTTGGATCGT CTTTTAAAAT AGCATCTAAT ATTGCATCTG ACACTTGGTC AGCGATTTTA 2700 

TCTGGGTGTC CTTCTGTAAC AGACTCTGAA GTAAATAATC GTTTGTTATT TAACATAGTT 2760 

TGCTCCTTTA AATTTATATT ACGAAAATTC TCTCTCTGTG AGCTAAATAA AAAAGACCTT 2820 

CTAACTATTA ATATAGAGAG AAGGCCTAAT ACGTCCATTC GCTCTTATCG TTCAGACCTA 2880 

TTTGTCTGCA AAcGGTTTGG CACCTTTCTT TTATAAAAAA GAGGTTGCTG GGTTTCATTG 2940 

GGTCCATGTC CCTCCACCAC TCAGGATAAG AGAATCCGTT AAAAATAATA GTACCTAATT 3000 

AATGAATTAA TGTCAATTTT TCACAAATAA ATTTACAGTA AAATATTGTA GATTAATTAT 3060 

GTTAATGTGT TATACTAATT AAATGTAAAG GCTTACATTT AAATTATCGC TTTGGAGGGA 3120 

TTTAGGATGT CAGTAGACAC ATACACTGAA ACAACTAAAA TTGACAAATT ACTGAAAAAA 3180 

CCAACGTCAC ATTTTCAACT TTCGACGACA CAACTTTATA ATAAAATCTT AGACAATAAC 3240

GAAGGGGTAT TAACAGAACT TGGTGCTGTT AATGCAAGTA CTGGAAAATA TACTGGTCGT 3300 

TCGCCTAAAG ACAAATTTTT TGTCTCTGAA CCTTCATATA GAGATAACAT TGATTGGGGA 3360 

GAAATTAATC AACCTATCGA TGAAGAAACT TTCTTGAAGT TATACCATAA AGTACTAGAC 3420

TATTTAGATA AAAAAGATGA ACTATACGTA TTTAAAgGcT ACGCTGGTAG CGATAAAGAT 3480 
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ATGTTTATTA GACCTGAATC AAAAGAAGAA GCTACAAAGA TTAAACCTAA CTTCACTATC 3600 

GTTTCTGCAC CACATTTTAA AGCAGATCCA GAAGTTGATG GTACTAAATC TGAAACCTTT 3660 

GTCATTATTT CATTTAAACA CAAAGTCATT TTAATCGGCG GTACTGAATA CGCTGGTGAA 3720 

ATGAAAAAAG GTATCTTCTC TGTAATGAAT TATCTCTTAC CGATGCAAGA TATTATGAGC 3780

ATGCATTGCT CAGCAAACGT TGGTGAAAAA GGCGATGTTG CATTATTCTT TGGTCTATCT 3840 

GGCACTGGTA AAACAACCTT ATCGGCTGAC CCACACCGTA AACTAATCGG TGATGATGAA 3900 

CACGGCTGGA ATAAAAACGG GGTCTTTAAT ATCGAAGGTG GCTGCTATGC AAAAGCAATT 3960 

AATCTTTCCA AAGAAAAAGA ACCACAGATT TTTGACGCAA TCAAATATGG TGCAATTTTA 4020 

GAGAACACTG TAGTTGCAGA AGATGGTTCA GTGGACTTTG AAGACAATCG TTATACAGAA 4080

AACACGCGTG CCGCTTATCC AATTAATCAC ATTGACAATA TTGTAGTACC ATCTAAAGCA 4140 

GCACATCCAA ATACAATTAT TTTCTTAACT GCGGATGCAT TTGGTGTTAT TCCACCGATT 4200 

TCAAAGTTAA ATAAAGACCA AGCAATGTAT CATTTCTTGA GTGGTTTCAC TTCTAAATTA 4260 

GCTGGTACAa GCGTGGTGTG ACAGAACCTG AACCATCATT CTCAACATGT TTCGGAGCAC 4320 

CGTTCTTCCC GTTACACCCT ACTGTTTACG CTGATCTATT AGGTGAACTT ATCGATTTAC 4380

ATGATGTTGA TGTTTATCTT GTTAATACTG GATGGACTGG CGGAAAATAT GGTGTAGGAC 4440

GTAGAATCAG CTTACATTAC ACACGTCAAA TGGTAAACCA AGCGATTTCT GGCAAATTGA 4500 

AAAATGCAGA ATATACAAAA GATAGTACGT TTGGTTTAAG CATTCCTGTA GAAATTGAAG 4560

ATGTACCGAA AACAATTTTA AATCCAATTA ATGCTTGGAG CGACAAAGAG AAATATAAAG 4620 

CACAAGCAGA AGATTTAATT CAACGTTTTG AAAAGAACTT CGAAAAATTT GGTGAAAAAG 4680 

TTGAACATAT TGCTGAAAAA GGTAGCTTCA ACAAATAAAT TTGAATACTA AATCaAAACC 4740 

ACCdGTGTGA ACGGGTGGTT TGTTCTGCGG CTATAAGCCT TCCTTACTGG CCAGCCCTAA 4800

AAGGGCACTG ACAAGTCAGC CAACTGCACT ACTATTCCAG CAACCCTAAA GGGTTACTCT 4860 

TTTTTCTTTC TTTTTTTATT TTTCTCTCCA GTGAAAGGAT CTAAATATTC TTCCATTGAG 4920

ATTTGGTCTG CAACGATATC CTCTTGTAAT TGATTACGAA TATAATTTTC AATCACTTTT 4980 

TTATTTCTAC CTACTGTATC CACATAAAAT CCTTTACACC AAAACTTTCT ATTTCCATAT 5040

CTATACTTTA AGTTAGCATG TCTATCAAAT ATCATTAAAC TACTTTTTCC TTTTAAATAG 5100 

CCAACAAATG ATGATACCCC AAGTTTGGGT GGTATACTAA CTAACATATG GATATGATCT 5160 

TTACATGCCT CTGCTTCAAT TATCTCTACA CCTTTTCTTT CACATAATTG ACGCAATATA 5220 

ATCCCTATAT CTTTTTTTAT TTTTCCATAT ATCACTTGTC TTCTGTATTT AGGTGCAAAG 5280 
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AAATAGCATC TCCTCGTGTT GATTATTTTG GTTGGCTGAC CAATATTTAT TCTAGCACGT 5400 

AGAGATGCAT TTTTTGTGAC AATGGTAGAA CCTTTTCtGa ACCATACGCA TAGCGTATGG 5460 

TTTTCTTTTT ACAATTAAAG AGCCAACCGT TGTTATAGTC TAACAATGGT TGGCTCCTCT 5520 

TATTTTATGT GCTAAAAATT TATAGGCAAT TTTATTACAA GAATGTACAT TTAAGGTGAC 5580 

CTTCATGCCA AAATCGCATC ACTCATTTAA TGGAAGCAGC ACGTCTTCAT ATAAAGTACC 5640

GATCCCTAAT TCAACGCATG TAGTACCACA TCTTCAAAGC TTGATAGTTC CCATGCGCAC 5700 

ACCACGTTTC ATACTAGCTA TGCGACTCAA CTTGGTTCAT AAACTCTTTA ATATAAGTCA 5760 

ATGTTTCAAC CATCGCTGGT GGTCTTGGCA CATGTCCTTC TGCCATTTGA TAAAATGTTT 5820 

GATGCGTGGC ACCTTTTAAC TCTAGTTGGT CCGCTAAATA ATACGCATGA TGAATACCAA 5880 

CTTGCTGGTC TTTCCCTCCA TGTACAATTA ATATTGGCGG ACTGTTTTCA TTAATGTTTG 5940

GAATCGCTTG GCGTGCCTCA TATGCCGCTC GATCTTTTTT CGGATGACCA ATCATTCTTC 6000

GTAGCATGCC TCTTAAATCG ACACGTTCTT CATACATTAA ATCAATATCT GAGACACCAC 6060

CCCAGATTGT ATAACTTGTT ACTGGTAAGT CTTGAAATGT CAACAATCCT TGTAAACCAC 6120 

CTCGCGAAAA ACCAACCATG TGGATAAATG CATGTGGATA TTTATCATGT AGCAACCTTA 6180

ATAATTGCGT CACATCATTT AAATCGCCAC GGTAAAATTC GTCTTTGCCT TCACTCCCAT 6240 
TGTTACCTCG GTAGTATGGC CCAATCACTA AAGTTTGACT ATCTGAAAAT TGCATTAATC 6300

TACCTGCGCG CACACGTCCT ACTTGACCTT TGCCACCTCG CAAATAAACT ACAATGCGAT 6360 

TTACTTCATG ATGTGGTGTC ATCATTAAAG CTTTTACTTG TAAGTCATCT GACAAATATG 6420 

TAATTTCTTC GAATTGATGC GTAAAATATT CAATTGGCAT TCGTTTACGT TTGATAAAAC 6480

CCAAGTGATT GCACCCTCTC TACGCATTTT AAAATGGTAC TATCTTGCAG TAAGAAACTC 6540 

CGTTGTGCGA GTTCAATATC ATTGATACAG TTAAACAACA CTGGCCCTGC TGTTTCTAAA 6600 

TAATCGTTCT TGCTTACCAA TGATTCAACT TCGATAAAAT ATACATCTTT TACAAAATCA 6660 

GTTTGATCAT GTGTTTCAAT GGTATATTGT GCTATGTAAT AAATATTTTT AACTTTGGCG 6720 

CCTGTTTCTT CATATAATTC aCGTGTAACT GCTTCAGCAC TACTTTCCCC GCGTTCCCTT 6780 

TTACCACCAG GAAATTCAAT CCCCCGTAAA TTATGTTTGG TAAAAAGCAA TTGATTTTTA 6840

AACGTTGGAA TAGCTAGCAC ATGATTGCCA TCTGCTATCT CATTATCCTT TTTAAATGTC 6900 

AAATTAACTT GACGATTATC TTTATCCCTA AACTTCACGC GCATCACATC CCTACATTGT 6960 

ATGTTAATAT AATAGTTAAT TACTATCGTT GGAGGCATTA ATTATGAAAA AGATATTCTT 7020 

GGCGATGATT CATTTTTATC AACGTTTCAT TTCGCCACTC ACTCCACCAA CTTGTCGTTT 7080
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CCTTTATTTA GGTATCCGTC GTATTTTAAA ATGTCATCCG CTTCATAAAG GCGGCTTTGA 7200 

CCCTGTTCCG TTAAAAAAAG ACAAGTCAGC AAGCAAGCAT TCACATAAAC ATAACCATTA 7260 

ATATGGTTGT AATTGAGTTA TATCCACTAA AGGGGGGCGA AATTCGAGTC GCCCCTCTTT 7320 

TAATATGCCT GAATGCGCCA CCACATCTTG TTCAAAATAA TAACCTGCTG GTGTAACATC 7380 

TCCTGGATAA TCACCTTTAC GAGCAAGCAT CGCTGTAAAA TAGCGGCTTA AACCATATTC 7440 

GTACATGCCG CCAATAACCA CTTTTGCACC ATGACTTTTC AAAGTATCAA TTGCCGTTTG 7500 

CACTTTATCA ATGCCACCTA GACGAAATGG TTTTAATACA ACAACTTTCA CATTGTATAA 7560 

TTCTATCAAA TTAATTATGT CCaACAACGA TGTTGCCTTT TCATCAAGGG CTATTGGAGG 7620 

TATTGTTCCA TCCGCTACTT CATCAAGCAT GGAGATATCT TTAAATGGCT CTTCGATATA 7680 

AAGAACCTGT TCACGCGCTA ATAACTGTAA CTGTGTGAAA TCTTGACGAT CCAAGGACTC 7740 

ATTTGCATCT ATAACCAATT GAAAGTGAAA GTCTAATTCC CGTAACACTC TAATTTGATG 7800 

CATGATTTGA GGCGTCCATT TTAATTTAAT TCTGGTCGGC TTTGTTGCTT TTAATGACTC 7860 

TAGTTGTTTA TTTGATAAGC CGCTCGcTGT CGCTCCATAT GCTACTGAAA ATGAAGGCAG 7920 

TACATGAAAC ATTTGATACA ATGCCATGAC AATAGTTGCC CTTGCAGCAG GCGTATTTTC 7980

CAATGAATCT ACTAATTTTA GTGCTGCTTC ATACGTTTCA AATGATTTAT TTCTATTATC 8040

TTCGAACCAT TGCTCAATTA CATGTTTCAC TGAGGCAATT GTTTCATGAT CATACCAATC 8100 

TGTTTGAAAA GCGTTACATT CCCCGAAATA TGCATTTCCT TTGTCATCAA TCAATTCGAT 8160 

AAACAAACAA TCACGATGCG TTAAAGTGAC TTTCGGTGTT ACAATTTGTG ACTTAAATGG 8220 

CTCACTATAT TTATAAAAAT GCAAAGCTGT CAACTTCATC AAATCATCCT CTATACAACT 8280 

TATTTCTTTG TAATTTACCT GTTGATGTAT AAGGTAAAGT ATCAACCTTT TCAAAGTGTT 8340 

TCGGTACTTT ATATTTCGCT AAATGTTGTG ATAAATATGC AATCAATTGT GCCTTTGAAA 8400

TGTGACTTTC ACTGACAAAA TATAATTTAG GCACTTGGCC CCAAGTATCA TCAGGATGCC 8460 

CTACACATAC TGCGTCACTG ATACCTGGAA ATTGctTCGC TACCGTTTCA ATTTGATATG 8520 

GATAAATATT TTCACCGCCA CTAATAATTA AATCTTTACG TCGGTCATAA ATCATGACAT 8580 

AACCTTCATG ATCTATTTCA GCAATGTCAC CCGTATTAAA ATAACCATTT TCAAACGTAC 8640

CCGTTAAATC TGTTGGATAC AAATATACAT TCATCACATT GGCGCCTTTA ATCATTAATT 8700 

CTCCATGACC TTCTTTATTA GGATTTTTAA TTTTTACGTC AACATTGGCA CTTGGCATCC 8760 

CTACAGTGTC AGGACGTGCA TGCAACATTT CCGGTGTTGC TGTTAAAAAT TGCGAACATG 8820

TCTCAGTCAT ACCAAATGAA TTATAAATTG GCAGGTTATA TTGTAATGCC GTCTCTATCA 8880 
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AACCTTGTTG CATAAGCCAA TTTAAAGTTT GTGGCACAAG 
CATTTTTAAT CATCGTTAAA ATTTGTTCGG CATTGAATTT 
AACCTTCAAT AACAGCTCTT AAAAGTACAC TGAGACCCGA 
CAGATAGCCA ATTAGTGTCA CGATCAAATC CCAAGCTCTC 
CATAATGATT ACGAAACGTT TGTGGCACCG CTTTTTGAGG 
ACATAATCGA TGCAATGTCA TCTAAATTAA ATGATGTATT 
CTTTCGGCAC CACAGTTTCA TTCGATGTTT CATATTGGAT 
AACTGTTCGT TGTAATATCC CTTCCAGCGA ATTCAATATC 
ACCCTCGTAA TTCCAGTGGC AAGGTACAAA AAATCAATTG 
GATTCGTCAT CTCATTAGGT GTCAACCTTG TATTAATCAT 
ACCAACATGC ATGTATTAAA ATGATCGATT GAATCGAATT 
GAGATTGTTG ATAAGCCTTG AGTCTTTTAG CCAATAGACT 
GATAAGTATA AGATTCTTGA CCGTCTGTTA TCGCAATATG 
GTTTATATAA CCAAAAGTCC ATGCGTTATT CCTCCAAAAT 
CGATTTTATG ACATTCTAGC AGTGGTTATG TTTAAAAATA 
TGCATTGATA TGATTGTTAT AATGCTCAAT ACATATCGTT 
TCAGTTATTT TTATTTAATT TTAGTGTCAT TCTGTCATTT 
TGTTGCCACA TCATCTGCAA TGTCAATTGG TATACGGTTC 
ATGGAATACT TCATCATCTA AATTTTCAAT GAGATATACA 
TTTATATTTT AACGTTTTCC AAAAGTCCGG CTTGCAATTC 
TTCAATAAAT AAGTAACGTT TGCTGCCTAC TTTGTCTATG 
TTCTATACCT CTTATATGTG CATAGTCTGC TGAAAAGTAA 
ATGTTGTTGT ATTTCAAATC GTTGGCCTAC TATTTTATTA 
(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1477 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 



CGAAATGTGC 
ATCAACAATG 
AATATGATAA 
TTTACATCCG 
GCCCGTTGTC 
TAATATGTTG 
ACCCATTGTG 
ATCCAGCGAT 
TACATCGATT 
CGCAATTTCA 
ATCTATGTAT 
CGCTTCACAG 
ATGTCCATTT 
CATTTACATT 
TAAAAAAGTA 
ATATCATTCG 
TGATGTGGTG 
ATGTCTTGTA 
TAATATGTTA 
AATACATTAT 
AAATATTTTG 
ATACTACCTA 
TTTGTGCTAC 



GTGATTCGTT 
CGCACAGTAA 
ATCGGCAAGA 
ATTGCACTGG 
CCTGATGTAA 
GACGGCGACT 
TTGTCCAACA 
ACAATTTGAA 
GACTTCATCT 
ATATTTGCCA 
AGCCCAACAC 
TATAAATTTT 
TGTTGTGCTT 
ATAATTATAA 
GACGAATTGA 
TCTACTATTA 
ATTTACCCAT 
ATGCACTTAA 
CCTTGTCCTT 
CCGGAATATA 
CAGTGCCTTT 
TTGTTTCATT 
nGGGGACTTA 



9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
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GTGTGGATTG 


GATTTTAAAA 


TCACCCTCAT AAATACTGTC 


ATCAATATGA 


TAAGTTACAA 


120 




TTTCACCTAT TATTAAATCA GCCCCATCTA ATACATCTCC AAGCAATATC ATTTGCGmTA 180 

GTTTACATTC GAATCTCATT TTCGCATCTT TAATTCCTGG CGTCTTAATC GTTGTAGATG 240 

TTAAAAGTGA TAATTCTGTA CGACTCAACT CACTGTCACC ATATGCTAAC GGCGCTGCAG 300 

TCTCATTAAT ATCTTGAACA TTATCTTCGT CTGTAATATG CACAACAAAG TCTCCAGTCC 360 

GTTCTATATT TAATGCAGTA TCTTTTCTCT TACCTCCTGC ACGTTGAACT GCAATAGCAA 420 

TCATTGGCGG ATGATTATTA ACAATATTAA AAAAGCTAAA TGGTGCTGCA TTTACTGATG 480 

CATCTTGATT TAATGTTGTA ACAAAAGCTA TAGGTCGTGG AATAATTGAA CCAATTAATA 540 

ATTTATAGTT TTCTCTAGCA GTTAATGATT GTGCATCAAA CGTATACATA ATACCTACCT 600 

CTTTTCTAAG TATATCTAGG TATTTCTCCG ATTTTGGTTA ATTTAAACAT CTATTCTCCT 660 

CTGAAAATCA CTTGTATTTA TTTAGCAAAT CTTTTGAAAT ATGACACATA TGCATATCTT 720 

CTGGATATTT TTCTAAATGT TGCTGATGTT CTTCAGCACT TTTAATGTAG TTAGACAGCG 780 

GTAAGACTTC CACTGCAATT TGATCTCTGT CTTTACGTCG TTCAATGAAC TGACGCGCTT 840 

CAATTAAGTG GTCATCTACA CAACTATATA AACCCGTTCG ATACTTTTGT CCAATATCAT 900 

TTCCTTGTTG ATTCACACTG TAAGGATCAA TGATTTCAAA TAAATAATTC ATAATGTCTG 960 

TAATTGTTAA CATACGATCA TCGAAATGAA GTTTGACACA TTCAGCATAA CCATCATACG 1020 

GACCGTCTAA TTTAGAGCTT CTTCCATTTG CTCTTCCTGC TTCTGTATGT ATAATTCCAG 1080 

GTATTGTTGC AAAAAATGCT TCAACACCCC ATAAACATCC TCCTGCTACA TAAACAACTG 1140 

CCATATTTAC ACCTCATCAT CCTTTTTTAT ATTTTTAACA AGGTTATACC ATTTAATACC 1200 

GCCATGACAT GATTCTGATA CACCTTCATT ACGATACCCA TATTTTTCAT AAAATGAAAT 1260 

TAATGATTCT CGACATGTTA ACGTTACACC ATGTCGATGA TGATTCTTAG CAAGAGTTTC 1320 

AAAATAGTTT AGTAAGCGAC CTGCAATACC CTGACCTTGA TAATTTGGTG CTACAACAAG 1380 

ACCTAACACA CTAATATAGC CACCTTCACT ATTATTTGTG GAGACATTTT TAAATAAATC 1440 

ATCGCTAATG TAACGCTCTT TTATGACTGG ACCGTTG 1477 


(2) INFORMATION FOR SEQ ID NO: 145: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3976 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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AGGTGATTAT CCTAAAAATG CTCATGAGGT CGCTATTAAT GATAAGTTAG CTGCAGACAA 60 

CATTAGAGTC GGGGATAGAT TACATTTTAA AAATAATTCA ACTAGTTATA GAGTTTCTGG 120 

TATTTTAAAC GACACAATGT ATGCGCATAG TTCCATTGTG CTATTGAACG ATAACGGATT 180 

TAATGCATTG AATAAGGTTA ATACGGCATT TTATCCAGTG AAAAATTTAA CACAACAACA 240 

ACGTGATGAG CTTAATAAAA TAAATGACGT TCAAGTTGTG AGTGAAAAAG ATTTAACAGG 300 

TAATATTGCG AGTTATCAAG CAGAGCAAGC ACCGTTAAAT ATGATGATTG TTAGTTTGTT 360 

TGCTATTACA GCAATCGTTC TAAGTGCATT TTTCTATGTT ATGACGATTC AAAAAATATC 420 

ACAAATTGGC ATTTTGAAAG CAATTGGTAT TAAGACAAGA CATTTATTGA GTGCGTTAGT 480 

TTTACAAATT TTAACACTAA CAATAATTGG GGTAGGTATT GCTGTGATCA TCATAGTAGG 540 

ACTATCATTT ATGATGCCGG TAACGATGCC TTTTTACTTA ACAACGCAAA ATATTTTATT 600 

AATGGTGGGG ATATTTATAT TAGTAGCGAT TTTAGGTGCC TCACTATCAT TTATCAAATT 660 

ATTTAAAGTG GATCCTATCG AAGCAATTGG AGGTGCAGAA TAATGGCATT AGTCGTTGAA 720 

GATATCGTCA AAAATTTCGG AGAAGGTTTG TCTGAAACAA AAGTTTTAAA AGGTATTAAT 780 

TTTGAAGTGG AACAAGGGGA ATTTGTCATT TTAAATGGTG CCTCTGGTTC TGGGAAAACA 840 

ACATTGCTAA CGATATTAGG CGGATTGTTA AGTCAAACGA GTGGTACAGT GCTTTACAAT 900 

GATGCGCCAT TGTTTGATAA ACAGCATCGT CCTAGTGATT TACGATTGGA AGATATTGGT 960 

TTTATTTTTC AATCTTCACA TTTAGTTCCT TATTTAAAAG TGATAGAGCA ATTGACACTC 1020 

GTAGGTCAAG AAGCGGGAAT GACCAAACAA CAAAGTTCAA CAAGAGCAAT ACAACTTTTG 1080 

AAAAATATTG GTTTAGAAGA TCGCTTGAAT GTATATCCGC ATCAGTTATC TGGCGGTGAA 1140 

AAGCAACGTG TTGCGATTAT GAGAGCATTT ATGAATAATC CGAAAATCAT TTTAGCAGAT 1200 

GAGOCCACAG CAAGTTTAGA TGCCGATAGA GCAACAAAAG TTGTTGAGAT GATACGTCAA 1260 

CAAATTAAAG AACAACAAAT GATTGGTATT ATGATTACAC ACGATCGAAG ATTATTTGAA 1320 

TATGCAGATC GAGTGATTGA ATTAGAAGAT GGCAAAATAA CTGATTAGTG GCTTGTAAAG 1380 

ACGCTAAATG TTAATGATTT AAGACATAGT AGTATAAAAG TTAGATAACA GAATACGATT 1440 

TGGGTTTACA AAAAACAGGC TGGGACATTA AGTTCTTAGG CAATGTAAAA AAGCTGATTT 1500 

CTATTAATTA TTTGATAGAA ATCAGCTTTT TTGATATGTA TTTTATAATG TACAGCTCGT 1560 

TGCATTCATA TAGCTTGAAG TCACGTTTAA AACCATATCT ATCATTATGG TATGCATATC 1620 

TTTTAAAACC TATTCTTTTG TTATTAGGAC ATATAAATTC ATCATTAAGT TCGTCATATT 1680 

TCCAATTTTG AGTGTTAAAA ATGTCACTTT TAAACTTTCT AGTTTTATCT TTAATAAACA 1740 
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CACTATCATA ACATGCATCA GCTACAATAT ACTCCGGTAA ATAACCGAAG nTATTTTgAA I860 

TCATTGTTAA AAATGGAATT AAAGTTCTAG TATCTGTTGG GTTTTGAAAT AGGTCATAGG 1920 

ATAAAACAAA TTGAGAATTT GTCGCTATTT GTAAATTGTA TCCTGGCTTA AGTTGGCCAA 1980 

AGTGTCTTAT TTTTTTAAAG TATTTAAAAG TAAAATTACA TGTTAATACG TAGTATTAAT 2040 

GGCGAGACTC CTGAGGGAGC AGTGCCAGTC GAAGaCAGGG GCCCCAACAC AGAArcTGAC 2100 

ATATAGTCAG CTTACAACAA TGTGCCGGTT GGGGTGGCTG AGACGGCACC CTAGGAAGGG 2160 

ACCCGTCATC AAAAATTCTA TTTATAGAAT TTTACAGTAA TGTGCCAGAT GGGCATAGCG 2220 

AAgcCATTCA ATACGAAGTA TTGTATAAAT AGAGAACAGC AGTAAGATAT TTTCTAATTG 2280 

AAAATTATTT TACTGCTGTT TTTTTTAGGG ATTAATGTCC CAGACTCTTT AGTTTATTTA 2340 

TTTTCAATAT AACAATTGTC TAATCAAGGA TTAACGAATA TTTAAAGATA GTTTGACGCA 2400 

ATATTAGAAA CAACCTATAA TAATAGTTTG TTTGTGGATT AACTATTATA AATAAAAGCG 2460 

GCGTAAAGAC ATATAAACCA ACTACTTGAA CAATATAACG TTAATAACAA TCTATACTGA 2520 

TACATTACGC CTAGATAATC TTTGATGAGC ACATGTAAGA AAAAGTGATA TGGTGTATGA 2580 

CTTCCGACAC CATCGATAGA TAAACCTAAT TTTTGGGCTA GTCGTAAGGC GCGCAATACA 2640 

TGAAACTGAC TTGTtACACA AACAATTTTA ACTGCTTCAT GATACAAATT GTTGATGATT 2700 

TGTTTAGAAT ATAAAAAGTT TGTGTATGTA TTTATAGAGT GAGATTCCAT TAGTATATCT 2760 

GTTTTATCAA CACCATGTGC AATCAAATAA CGTTGCATAG CTAAAGCTTC AGAAATTGGT 2820 

TCGTCTGGTC CTTGTCCGCC AGATACAATG ATCTTTGTTG CTGATGCTTG TTGTTGATAG 2880 

ATATCAAGTG CACGATCTAA ACGCGCTGCA AGCATTGGTG TGACAAATTC GGTAAAAATA 2940 

CCAGCACCTA ACACAATTAT GATATCAACT TCTTTGTTGT ATGATCTATG TCTATATGAT 3000 

ACTGTCCAAA CGAGATAACA AATAAAGGTT AGTAACAGGG AAAGACATAA TATAGCTAAC 3060 

CACATAGACA AACCTTTCAC AATAGGTGAC TGAATCGTAC TTATAAATAG AAGTGCTGAT 3120 

GTGTAGAGTA CAAATTTATA TGAAAAAGAT AATAATTTTT TAATAAATAA GCGACTAGAA 3180 

GTATGAGAAA ATAAATATCT ATGTTTGAAT AGCATGATAA TACTGATTAT TATAAATGTT 3240 

ACAAACATAG ACCAAGGGAA AGTATAGGTC ATGATGCTAT AGATGAGTGA CAAAAATATC 3300 

GATATGACAA CTAAGATGTA GCATGTTAAA TTTAACGTCA GAGTATAGTT GAAAATTAAC 3360 

GGACAAATAA CGATAAGTAT AAATATTAAT AATAAATTCA ATAACATACT GACACCTCGC 3420 

TTATAATAAA TATTAAATAT AAATGTAGAT GATTTAATTT ATTAAAGCAA GGAGAAAGCA 3480 

GCAACATGTA AATCTTAATT TGTTATATTA TATATGGGTC AATATTTTTG TGTTTTTTAG 3540 
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TATGGTAAAA CATTTAGAAG ACCATATTCA ATTTTTAGAG CAGTTTATAA ATAACGTTAA 3660 
CGCATTAACT GCAAAAATGT TGAAAGATTT ACAAAATGAA TATGAAATTT CATTAGAGCA 3720 
GTCTAACGTA TTAGGTATGT TAAATAAAGA ACCTTTGACA ATTAGTGAAA TCACGCAAAG 3780 

ACAAGGTGTA AATAAGGCCG CAGTAAGCCG ACGAATTAAA AAGTTAATCG ATGCTTAATT 3840 
AGTTAAGTTA GATAAACCAA ATTTAAATAT TGATCAACGT TTGAAATTCA TAACCTTAAC 3900 

TGACAAAGGT AgAGCATATT TGAAAGAACG TAATGCGATT ATGACAGATA TTGCGCAAGA 3960 
TATTACTAAT GATTTA 3976 
(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3346 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 



GCTACCTAGG CATTTAAGAG ATCAAAAAAT GTATGAATAT GAACGTTATT TTTATGAGCA 60 

AGAACTTAAT GGCGTTGATG aAGGGGAAAT TTTAAAGAAG TTAAAAGACC CACAAGATGT 120 

TGCAGCTGAA ACAAAAGCTA GAAGTGTTAT TGATTATGCT GAATCTAAAC CAACATTTGA 180 

AAATATTTCA AGAGCTGTTG CTGCTTCATT AAGTTTAGGC ATTCTATCTA TTTTTGTCAT 240 

CCTTATACCA GTATCTATAG TTGGATTATT TGTATTAGCA TTATTTTTAA TATCACTTTT 300 

GCTGCTGTTT TGTCCAATTA TTTTATTAGC ATCAGCAATA TCCAGAGGAA TTGTGGACTC 360 

AATTAGTAAT GTATTTTTTG CCATATCATA TTCAGGATTA GGATTAGTAT TTATCATTGT 420 

CATATTTAAG ATTTTAGAAT ACATTTATCG TTTAATCTTA AAATATTTAC TTTGGTATAT 480 

TAAAACTGTC AAAGGAAGCG TTAGAAAATG AAGAAATTCT TTTTTATTGG GCTTTTAGTG 540 

TTTGTTGTCT TTTTTACAGC AGCAACCATT ATTTGGTTCA GCTATGATAA AAACAAATAT 600 

GGTACTAAAC AATATGATAA AACATTCAAA gACGATGCTT TTGACAATGT ATCTATAAAT 660 

TTGGATAGTA CAGAACTTCG TATAAAACGG GGGAATCAAT TTAGAGTTAA ATATGATGGT 720 

GACAATGATA TATTAATTAA TATAGTAGAT AAGACGTTGA AGATTAGTGA TAAAAGGTCT 780 

AAGACAAGAG GATATGCAAT TGATATGAAT CCTTTTCATG AGAATAAGAA AACGTTAACG 840 

ATTGAAATGC CTGATAAAAT GATTAAACGT TTAAATCTAT CATCTGGAGC AGGAAGTGTT 900 

AGAATCAGTG ATGTTGATTT AGAGAACACA AGTATTCAAA GCATTAACGG TGAAGTAGTT 960 
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AGTAAAAGTA ACATTAAAAA TAGCAATATT AAAGTTGTTA TTGGTACGCT ACAAATCGAC . 1080 

AAGAGTCAAA TTAAACAATC CATATTTTTA AACGATCATG GTGACATTGA ATTTAAAAAC 1140 

ATGCCATCAA AAGTAGATGC AAAAGCTTCT ACTAAACAAG GAGATATTCG TTTTAAGTAT 1200 

GATAGTAAAC CTGAAGACAC TATACTAAAG CTAAATCCGG GAACGGGTGA TAGCGTAGTT 1260 

AAAAATAAAA CATTTACTAA TGGtAAAGTT GGGAAAAGCG ACAATGTTTT AGAATTTTAT 1320 

ACGATTGATG GTAATATCAA AGTTGAATAA ATAAAGGATG TAAGCACCGA TATTAGGAAG 1380 

CATAATTTCT CTAATATCGG TGTTATTTAT TTGTTGGCAA AAGTTAAGTC GGTATCTATA 1440 

TTGCCAGTAA AGTGAGTGAT ATTAAGGTCT TGACCATCTA ACCATGATTT GAAATCTATT 1500 

ATTTCTGGTG GCGCATTTTC TCCCAATGTA AAATATGCAG TTAATGTTTC AGGTTGATAC 1560 

ATTGATGTAT GGATGGTGCC AGACCAGCTT TTGAATAGTT TACTGTAAAT TTCATACTGA 1620 

GGATTATTGA ATAACTTAAA TGCTGTAGTC ATATCTAAAT TATCATTAGT TTGTGAAATG 1680 

GTACGCGCCA GTCTTTCTTT AGATTCTTTT GTATAATTAC GATTTTCATG TGTTAATATT 1740 

TCAAAATGAT TTGTACATAT ATTATCATAA CGAACATCTA TTGATCTCGG TGTCACTTCA 1800 

ACAATTGCAT GGTTCAATGA TTTGTCCATC AGTATGTAGC TAAATGAGCT TCTGTGTGGT 1860 

ATTTCTTTCA ATAATTGGAT TGCTTCTGTT ACATTTCGGC AATTTTCAAG AATTAGACGA 1920 

CCAATCATAT AACATACAAA ACCATTTGCT GGTTTCTTCC GGTGCATAAA GTTATAGCCC 1980 

ATAGTTAATC CTGACTCATT CATACCATCC ATTCTTCCAG TTACCCTTGA TACAGGACCA 2040 

ATTTGAGCTA AACCGCTATC TGTAGGTTGA TAAAGTAAGT AGCGACCATC ATAAGTTGCA 2100 

GGGTGGTAAT CATAATTTCT AACCATGAAG TCTTTGCCTT GAAAGACCGT GCAaCCACTT 2160 

TCTTTTAAAT CGGTAAAACG ATAATGTCCA AAGTTTAAAA TAATTTGGCG TGTTGGCATT 2220 

TTGAGTATAC TTTGTAGTCC CATTAATTCT TCCCATATTT GAGGTGCGTA TGTTTGGAAT 2280 

ATTTGATAAG TTTCATTTAC ATCTATATCG AAACGTGGGA CaCnTTTTTT CCATTCTTTT 2340 

TCTCGATTTT TTAGAAGAGG TGTTTGTTGA AGCCATTTAC CAGTTTTAAC ACCTAACTCG 2400 

AAATGTGAAC CTCTAAAAGT CATGATATCT GATGTCACTT GTTGCATATC ATCGGCCCCT 2460 

TTCTTTTTAG TTGTAATATA TTGTAAATAA ATAGTAATCG TATGTATATT GAATGTCATG 2520 

TTAAATAAAG TTATATTTTA CTAAATGAAA TATAAAATTG TTTGAGGTGA TTTCTCGGTG 2580 

TATAAGACTT ATCAATCAGT TAAAACATAT TTTTATAGAT GGTGGGGATA TTGAGTTAAA 2640 

AACTTAAAAT CATCTTATCA TAAATATCAA TCTTAAGTTA GCATTCACGA TAATAGTCAT 2700 

TGTTAACATT AGCATATAAG GTCATGTCAC GTTGAAACAG AGGTTCCTCG GCATTTTTGA 2760 
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TTATTTAATG ATTATTCTAT ATATGATAGT ATAATGAAAT GTAGATAGGT ATTTAATTTA 2B80 

ACAGAGGTGA AATTGAGATG TGGAATTTTA TTAAATGtGT GkTTAAATTC GTATTTAGCT 2940 

TAGTTGCTAT TACAACATTA GTTGCTGGTG TTGGTGTAGT AGCATTTGCT TATATCTTTA 3000 

AAAAAGATTT TGAAGATATT GAAAGAAAAA CTAAAGAAAT TATTTCTGAT ATTGAAAGTA 3060 

AAAATAACTA ATAACATTTA GAGGCTGGGA CATAAATCCC TAAAAAACAG CAGTAAGATA 3120 

ATTTTCAATT AGAAAATATC TTACTGCTGT TCTCTATTTn ATcAmTACTt CGTATTGAAT 3180 

GGCTTCGCTT TCCTAGGGTG CCGTCTCAGC CTTGGTCTTC GACTGGCACT GCTCCCTCAG 3240 

GAGTCTCGCC ATTAATACTA CGTATTAACA TGTAATTTTA CTTTGGAAAT ACTTTTAAAA 3300 

AATAAGACAC TTTGGCCCAA CTTGGCACAT AAATGTAAAA TTCAAT 3346 
(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

GTTGAAGAAA GAAATATAAC AGTCAATTAT AATTATAACC TTGTTGAAAT CGACGGTGAC 60 

AAAAAAGTGG CTACATTCGA ACATATCAAA GCATACGATA GAAAAACAAT AAGTTATGAT 120 

ATGTTACATG TAACACCACC TATGGGTCCC TTAGATGTAG TAAAAGAAAG TACACTTTCA 180 

GATAGTGAGG GTTGGGTAGA TGTTAACCCA ACCACATTAC AGCATAAAAG CTACTCTAAT 240 

GTATTTGCAC TTGGTGATGC TTCAAATGTA CCTACTTCAA AAACAGGCGC ACTATTcGTA 300 

AGCAAGCACC TATCGTCGCT AATAATTTAT TGCAAGTGAT GAATAATCAA ATGTTAACGC 360 

ATCATTATGA TGGTTATACT TCATGCCCTA TTGTTACTGG ATATAATAGG TTAATACTTG 420 

CAGAGTTTGA TTATAATAAA AATACTAAAG AAACAATGCC GTTTAATCAG GCCAAAGAAC 480 

GTaGAAGTAT GTATATATTT AAGAAAGATT TATTACCTAA AATGTATTGG TACGGCATGC 540 

TAAAAGGATT AATATAATAA AGTACAGAAA ACAATAAATT TTTAATGAAA AATCTTTTAC 600 

TATAAAAGAT TAAGTATTTA AATGACGTGT CAGTGTTGTG TTTATATGTC GTGAATTTTT 660 

AGCTCTAAAT AGTATAAGAT TGAAAAAGTT GTTACTGTTT TAAATGATCA CGATGAAGTC 720 

ATTCAATAAG AATGATTATG AAAATAGAAA CAGCAGTAAG ATATTTTCTA ATTGAAAATC 780 

ATCTCACTGC TGTTTTTTAA AGGTTTATAC CTCATCCTCT AAATTATTTA AAAATAATTA 840 
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AGATATTCAA 


ACCACGTGTA 


CTCAAAATGA 


TAGCTTGGTA TGTACCTCCA ATAGTAATTT 


960 




CAATAACTTT GTCTGTTGAA CACTAAGAGC AATTTTAATT TCATAATGTG TTGTAAACAT 1020 

TTTTTTTGAT TGGAGTTTTT TTCTGAGTTA AACGATATCC TGATGTATTT TTAATTTTGC 1080 

ACCATTTCCA AAAGGATAAG TGACATAAGT AAAAAGGCAT CATCGGGAGT TATCCTATCA 1140 

GGAAAACCAA GATAATACCT AAGTAGAAAG TGTTCAATCC GTGTTAAATT GGGAAATATC 1200 

ATCCATAAAC TTTATTACTC ATACTATAAT TCAATTTTAA CGTCTTCGTC CATTTGGGCT 1260 

TCAAATTCAT CGAGTAGTGC TCGTGCTTCT GCAATTGATT GTGTGTTCAT CAATTGATGT 1320 

CGAAGTTCGC TAGCGCCTCT TATGCCACGC ACATAGATTT TAAAGAATCT ACGCAArCTC 1380 

TTGAATTGTC GTATTTCATC TTTyTCATAT TTGTTAAACA ATGATArATG CAATCTCAAy 1440 

ArATCTAATA GTTCyTTGCT TGTGTGTTCG CGTGGTTCTT TTTCAAAAGT GAATGGATTG 1500 

TGGAAAATGC CTCTACCAAT CATGATGCCA TCAATACCAT ATTTTTCTGC AAGTTCAAGT 1560 

CCTGTTTTTC TATCGGGAAT ATCATCGTTA ATTGTTAACA ATGTGTTTGG TGCAATTTCG 1620 

TCACGTAAAT TTTTAATAGC TTCGATTAAT TCCCAATGTG CATCTACTTT ACTCATGCGT 1680 

TTGATAAAAA CTTAAATAAT ATTAATTCGG TCATCAGTGG CGTTAAATCT TTTATCATTT 1740 

TTAGTTATAG TTGATAAATT TATATTTATA AGCATATATG GATATTTCAT CAAAAATTTT 1800 

TATTTATATA AATCCGAACT GCATACATAT TTGTTTAAAT AAGAGGTATT ATTTTTCGGG 1860 

AAATTGCTGT CTGAGTTAAA AGGATTAGTT TTATAAAATG AGTTGAACTA TAGCCAAAAA 1920 

CGATTAAAAT ACTGATAATC CATTTTTGtA TTATGTTAGG GACTTTTTTA CTTAATTTTA 1980 

ACCCTATTGG aGCmAATATA ATACTCCCTA TTATAAGGAA TAAGGCGTCA TATAAaGGGA 2040 

TATAACCTTG AATAAGTTTG ATGACAAAAG CACCAATTGA AGATATAAAA GCAATTACTA 2100 

TACfXTTAGC GACTACAGTA TTCATTGGTA ATTTGAATAA AACCAATAAT ATAGGAATAA 2160 

TAATGAAGGC ACCACCTCCA CCTACTATAC CTGAAATAAT ACCAATGAAA AGGCCAATGA 2220 

TAACTAATAA ATATTTATTA AATGAAGACT TTTCGGAACT AGGTTtCACT TTAATAAACA 2280 

TTAATGTTAA TGCAAGTAAA GCAATAATGA TATATACCGT ATTTACAAAT GTAGCATCAA 2340 

ATAAATTTGC TAGAAATGCA CCTAACATAC TCCCT 2375 



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6115 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

GAGGTTTCTA GACAAGCTTT TAATAACTTA CCAAACTCAT TAAgrTGGTT gTGtTGGACT 60 

GCCtATTATC mAAGtATTAT GaGTTGTTTA ATATTAGtGC TAArACATAC GAAGAGTGGT 120 

TTAAACAATT TAGTAGTAAG AAAGCACAAT TCAGTATTAA TCTCACGGAT AAATGGATAA 180 

TTCAAATCGC ATATGGTAAA TTAATAATAA TGGCTAAAAA TAATGGCGAT ACATATTTTA 240 

GAGTTCAAAC AATTAAAAAG CCAGGTAATT ATATTTTTAA CAAATATCGA TTAGAGATAC 300 

ATTCTAATTT ACCAAAATGT TTATTTCCGC TTACAGTGAG AACACGACAA AGTGGCGATA 360 

CATTTAAACT GAATGGGCGC GATGGTTATA AGAAAGTGAA TCGCCTGTTT ATAGATTGTA 420 

AAGTGCCACA GTGGGTTCGG GATCAAATGC CAATCGTATT GGATAAACAA CAGCGCATTA 480 

TTGCGGTAGG AGATTTATAT CAACAACAAA CAATAAAAAA ATGGATTATA ATTAGTAAAA 540 

ATGGAGATGA ATAGCGTTAT GCATAATGAT TTGAAAGAAG TATTGTTAAC TGAAGAAGAT 600 

ATTCAAAATA TCTGTAAGGA ATTGGGAGCA CAATTAACAA AGGATTATCA AGGTAAACCA 660 

TTAGTATGCG TGGGTATCTT AAAAGGCTCA GCAATGTTTA TGTCAGATTT AATTAAACGA 720 

ATTGATACCC ATTTATCAAT TGATTTCATG GATGTTTCTA GTTATCACGG AGGCACTGAG 780 

TCAACTGGTG AAGTTCAAAT CATTAAAGAT TTAGGTTCTT CTATTGAAAA TAAAGACGTA 840 

TTAATTATTG AAGATATCTT AGAGACTGGT ACTACACTTA AGTCAATTAC TGAATTATTA 900 

CAATCTAGAA AAGTTAATTC ATTAGAAATA GTTACTTTAT TAGATAAACC AAACCGTCGT 960 

AAAGCGGACA TTGAAGCTAA GTATGTAGGT AAAAAAATAC CAGATGaATT TGTTGTTGGt 1020 

TACGGTTTAG ATTATCGTGA ATTATACCGA AACTTACCAT ATATCGGTAC GTTAAAACCT 1080 

GAAGTGTATT CAAATTAATT TTTTAATCAA TTTCAGTTAT TATTACTATG CGTTTGAGAA 1140 

ATAATAGTGT AGACTCAAAA ATATGAAAAA TGTATTTCAT ATATATTTAA TTTTAGACAA 1200 

GACATATGTC TTGAAAAGTT GAAAAATATA GAGATTGATA AAACTAATAC GGGTGTGAAT 1260 

GACATTGATG TTAAGCTCAA TTACTAGCTT ATAAAACATG TCATATGTTA CAATTTTTGT 1320 

TAGTTTTATT ATGGGAAGTA GGAGGAAATG ACGCATGCAG AAAGCTTTTC GCAATGTGCT 1380 

AGTTATCGTA ATAATAGGCG TTATTATTTT TGGTCTATTT TCATATTTAA ACGGTAATGG 1440 

AAATATGCCG AAACAGCTTA CATATAATCA ATTTACTGAG AAGTTGGAAA AAGGTGACCT 1500 

TAAAACTTTA GAAATCCAAC CACAACAAAA TGTCTATATG GTAAGTGGTA AAACGAAAAA 1560 

TGATGAAGAC TATTCATCAA CTATTTTATA TAACAACGAA AAAGAATTAC AAAAAATTAC 1620 

TGATGCTGCT AAAAAGCAAA ACGGTGTAAA ATTAACGATT AAAGAAGAAG AAAAACAAAG 1680 
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TTTCTTCCTA AGCCAAGCAC AAGGTGGCGG TAGTGGCGGT CGTATGATGA ACTTTGGTAA 
ATCTAAAGCA AAAATGTACG ATAATAATAA ACGTCGTGTT CGTTTCTCTG ATGTAGCAGG 
GGCAGATGAA GAAAAACAAG AATTAATTGA AATTGTTGAT TTCTTGAAAG ATAATAAAAA 
ATTCAAAGAA ATGGGATCTA GGATTCCTAA AGGTGTCTTA CTTGTTGGAC CTCCAGGTAC 
TGGTAAAACA TTACTTGCTA GAGCGGTTGC AGGTGAAGCT GGCGCACCAT TCTTCTCTAT 
TAGTGGTTCA GACTTTGTAG AGATGTTTGT TGGTGTTGGT GCGAGCCGTG TTCGTGACTT 
ATTCGATAAT GCTAAGAAAA ACGCGCCTTG TATCATCTTT ATCGATGAGA TTGATGCTGT 
TGGTCGTCAA CGTGGTGCAG GTGTTGGTGG CGGTCATGAT GAACGTGAAC AAACCCTAAA 
CCAATTATTA GTTGAAATGG ATGGTTTCGG TGAAAATGAA GGTATCATTA TGATAGCTGC 
TACAAACCGT CCTGATATCC TTGACCCAGC CTTATTACGT CCAGGTCGTT TTGATAGACA 
AATTCAAGTT GGTCGTCCAG ATGTGAAAGG CCGTGAAGCA ATTCTTCATG TTCATGCTAA 
AAACAAACCA CTTGATGAAA CGGTTGATTT AAAAGGAATT TCACAACGTA CACCTGGTTT 
CTCAGGTGCT GATTTAGAGA ACTTATTAAA TGAAGCATCT TTAATTGCTG TACGTGAAGG 
TAAAAAGAAA ATTGACATGA GAGATATCGA AGAGGCAACG GATAGAGTTA TAGCCGGACC 
TGCTAAGAAA TCTCGAGTTA TTTCTAAGAA AGAACGTAAT ATTGTTGCTC ATCACGAAGC 
TGGTCATACA ATTATCGGTA TGGTACTTGA TGAGGCAGAA GTAGTGCATA AAGTTACTAT 
TGTTCCACGT GGACAAGCAG GTGGTTATGC AATGATGCTA CCTAAACAAG ATCGTTTCTT 
AATGACTGAA CAAGAGTTAT TAGATAAAAT CTGTGGTTTA CTTGGTGGAC GTGTATCAGA 
AGATATTAAC TTTAACGAAG TATCAACAGG TGCTTCAAAT GACTTCGAAC GTGCAACACA 
AATCGCACGC TCAATGGTTA CGCAATATGG TATGAGTAAA AAATTAGGAC CATTACAGTT 
CGGTCATAGC AATGGTCAAG TATTCTTAGG TAAAGATATG CAAGGTGAGC CTAATTATTC 
AAGCCAAATC GCATATGAAA TTGATAAAGA AGTTCAACGA ATCGTTAAAG AACAATACGA 
ACGTTGTAAA CAAATTTTAT TAGAGCACAA AGAACAATTA ATTTTAATTG CTGAAACATT 
ATTAACAGAA GAAACATTAG TTGCTGAACA AATTCAATCA TTATTCTACG AAGGTAAATT 
ACCTGAAATT GATTATGATG CAGCTAAAGT TGTTAAAGAT GAAGATTCTG AATTTAATGA 
TGGTAAATTC GGTAAATCTT ATGAAGAGAT TCGTAAAGAG CAATTAGAAG ATGGACAACG 
TGACGAAAGT GAAGATCGTA AAGAAGAAAA AGATATTGCT GAGGATAAAA AAGAAGCTGA 
TAAATCTGAT GAAAAAGATG AACCAGCACA TCGACAAGCC CCAAATATCG AAAAACCTTA 
CGATCCAAAT CACCCAGACA ATAAATAATC GATTATATTC AGTACCTCTT TCTATGATAA 
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AATTGTTATA GCAGAAAATA ATTGTAAAAC 


AAGTTACTTC 


ATTATTTAGA 


ATGATGGGTG 


3600 




TAGAATAAGT ACAATTGTTG CATTTTATGA AGTAAAGTAA TTT'ITTAAAT ATAGAGTAAT 3660 

AGAGGAGATT GAAATAATGA CACACGATTA TATTGTTAAA GCATTAGCAT TTGATGGAGA 3720 

GATTAGGGCT TATGCTGCTT TGACAACTGA AACTGTTCAA GAAGCACAAA CGAGACATTA 3780 

TACATGGCCG ACAGCATCTG CTGCAATGGG AAGAACAATG caCAGCAACA GCTATGATGG 3840 

GCGCAATGTT GAAAGGTGAT CAAAAATTAA CTGTCACTGT AGATGGCCAA GGACCTATTG 3900 

GACGAATTAT TGCCGATGCA AATGCTAAAG GCGAGGTGCG TGCTTATGTA GACCATCCAC 3960 

AAACTCATTT TCCATTAAAT GAGCAAGGTA AACTTGATGT AAGACGAGCG GTAGGGACAA 4020 


ATGGATCTAT 


A X \J\3 X X w A X 


AAAGACGTTG 


GAATGAAAGA 


CTATTTCtCT 


GGAGCAAGTC 


4080 




Pa ATTGTTTP 


AGG AG AA PTT 


GGTGAAGATT 

«w x unnwi x x 


TTACTTATTA 


TTATGCTACA 


AGTGAACAAA 


4140 


20 


CACCTTCATr 1 

Vh^TVv WAX V-*J» X V* 


GGTAGGTfTT 


GGTGTATTGG 


TAAATCCTGA 


TAATACGATT 


AAAGCAGCAG 


4200 




GAGGATTTAT 




ATG CCAGGTG 


CCAAAGATGA 


AACAATTTCA 


AAATTAGAAA 


4260 




AAGCAATTAG 


TGAAATGAPA 


GGAGTTTPTA 

W \>A W X X x ^» x <n 


AATTAATTGA 


ACAAGGATTA 


ACGCCAGAAG 


4320 


25 


GATTACTAAA 

^J^X X X X ^^^^^^ 


CGAAATPTTA 


GGTG AAG A C*C 


ATGTGCAAAT 


TTTAGAGAAA 


ATGCCTGTTC 


4380 




AATTTGAATG TAATTGTAGT CATGAGAAAT TTTTAAATGC TATTAAAGGA TTGGGCGAGG 4440 

CTGAGATTCA AAATATGATT AAAGAAGATC ATGGTGCTGA AGCAGTATGT CATTTCTGTG 4500 

GAAATAAATA TAAATATACT GAAGAAGAAT TAAACGTGTT GCTAGAAAGT TTAGCGTAAT 4560 

TTAATTTAAA TCAATACGCT AAAATGTTTA T1TITAGCGG TTTAGTGAAA TGTAGAACTA 4620 

AATAGTTGTA TAATCCTTAG TGATTTTGTT TGCTTTCTAG AATTTATTTG ATAAAATAAT 4680 

TCTATATCCG ATAAATAAAC TAAGATITCA ACAACTAACT AAAAAGGAGT GTTCTTAATG 4740 

GCAGAAAAAC CAGTAGATAA TATTACTCAA ATTATTGGCG GTACACCGGT AGTCAAATTG 4800 

AGAAATGTAG TAGATGACAA TGCAGCAGAT GTTTATGTAA AATTGGAATA TCAAAATCCA 4860 

GGTGGTTCTG TAAAGGATAG AATTGCTTTA GCAATGATTG AAAAAGCAGA GCGAGAAGGC 4920 

AAAATTAAAC CTGGCGATAC AATTGTAGAA CCAACAAGTG GTAATACAGG TATCGGTTTA 4980 

GCATTTGTAT GTGCTGCTAA AGGATATAAA GCAGTATTTA CTATGCCCGA AACAATGAGC 5040 

CAAGAGCGTC GTAATTTATT AAAAGCATAC GGTGCGGAAT TAGTTTTAAC GCCTGGATCA 5100 

GAAGCGATGA AAGGTGCAAT TAAAAAAGCT AAAGAATTGA AAGAAGAACA TGGTTACTTC 5160 

GAGCCACAAC AATTTGAAAA CCCTGCGAAC CCTGAAGTTC ATGAGTTAAC TACAGGTCCT 5220 

GAGTTATTAC AACAATTTGA AGGGAAAACT ATCGATGCGT TCCTAGCTGG TGTTGGTACT 5280 
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GTTGCTATAG AGCCTGAGGC TTCTCCAGTA TTGAGCGGTG 
TTACAAGGTT TAGGTGCTGG ATTTATTCCA GGCACTTTGA 
ATTATTAAAG TAGGAAATGA TACAGCGATG GAAATGTCTC 

GGTATTTTAG CAGGTATTTC ATCAGGTGCT GCGATTTATG 
GAATTAGGAA AAGGTAAAAC AGTAGTAACA GTATTGCCGA 

TCAACACCTT TATATTCATT CGATGACTAA TTAATGTCAT 
TTTGAGATAA CTTGCTCTTT TTTTCTACCA TGTATATTTT 
TAAACATTTT TCTGATAAAA ATATCCAGTG AATGATAAGA 

AACTAGTAAA TAGCAGGAGT AAATTTTATT AGAGTTAAAC 

TTAACATGAC TAAAACAAAA ATTATGGGcA TATTAAACGT 
ATGGTGGAAA ATTTAATAAT GTTGAATCAG CTATAAATAG 

AAGGTGCTGA CATTATAGAT GTTGGAGGTG TTTCAACGAG 

CATTAGAAGA TGAGATGAAC AGAGTATTAC CTGTTGTTGA 
(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10401 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

TAGATACTGG GnTAAAcaTc AAAAATAtyT GCtTaTTCaC GTGTTTAcGc TCCCtCAAAC 60 

GCAACGTTAA TTGCGTGTAA TCATTTAGTG TGAATTcAGA CGCTTCTTCC ATGACTATGT 120 

CTGATATGCC TTTTATCGAC TTTATTTTCT CTGGGTTATC TAATCCTTTA AACAAAAAAA 180 

CTGCGCCGTT TGGCAATTCA ACTTTGTTAT CAGTCTTATT CCAAAGGCAC ATGTCCCAAA 240 

TACCAAAGTT TATCAAACAA TCTTTAACAT CTTCGAACAA ACTATCTTTA ATTGTTGATT 300 

GTACTTTTCT AAGCCACAGT ATACGCCTAG GATATTTCCA ATCTTGCAAT GCTTTGAGTA 360 

CAACTTTTTG TATAACGCCG TGAGACTTAC CGCTCGAACC TCCACCGTAA TGkACTTCAG 420 

TGAAGTtATC GTAATTGGTT AGTATTTCGA ATATGTTTCT ATTGAAAACA TTAGACGGTT 480 

TGTTAAAGTT TAATTTAACT TTCGTCATCG TACTCACCAA TATTAATCTC AATATTCTTC 540 

TGAGTAATTT CTTTTTTATC GATATACGCA CCATGTACTT TTAGTATGTG GTCAATAGAT 600 



GTGAGCCAGG TCCACATAAA 5400 

ATACAGAAAT CTATGACAGT 5460 

GTCGAGTTGC TAAAGAGGAA 5520 

CTGCCATTCA AAAAGCAAAA 5580 

GTAATGGTGA ACGCTACTTA 5640 

TTAAAAGAGT GAGTTATCTT 5700 

TAAAAATATG AGCGTTAAAT 5760 

TAATAAACGT ACATACTAAT 5820 

AATACATAAT TAAAGGGTGG 5880 

CACACCTGAT TcATTCTcAG 5940 

aGTGAAAGCC ATGATAGATG 6000 

ACCCGGTCAT GAAATGGTTT 6060 

AGCTATTGTC GGTTT 6115 
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x 1 1 AHA I Kj<j 1 


CAT ATTT C TT 


ACTGTAAGCC 


TCTTGAGGTT 


CTCCTCTAGC 


AATAGAAGCA 


720 




ft AT A A rnrTa 

OA 1 nnLuL 1 A 


AAGCTTCTGT 


AATACTCATT 


AAACGCTCTT 


CTTGTATCTG 


TTCTAATCGT 


780 


5 


lAAi Ai 


ATTCCGAAAC 


ATTAACATTT 


CTTAACAATC 


GACTTGCTAA 


AGACTCTGCT 


840 




GTTTTCTTAC TATAACCTGC TGTAATTGCT GCTTTTTTAC CATTACATCC ATTCATTATA 900 

TATTCATCTG CGAATCTCTT TTGTTTTTCG TTCATTTCAT TTACCACCAA CTCTCGCGCT 960 

ATACGCTTTT TAAAATTAAA AAAGGATTGG CTATAATCAG CCAACCCACA TAGATCCTTT 1020 

ATTCCTAATT GCGATAAGGG AAACGCAGTA CGATAGTCAA TATCCTACAC TATCATAATA 1080 

TCTCATTTAA GGTATCAAAA ACTGCCACTT TACTGCCAAT TTCAGTCTTC CCCTAACTCT 1140 


TCCG CCAATC 


TAn ZVT ITr! A T 


TTTTCTTTTG 


All t.TATGAG 


CAGTTCTATC 


AGAAATGTGT 


1200 






nnnLi 1 x ^n\. 


1AA1 1 1 X 


TTTATThTA A TV HP 

1 1 Al 1 AAAAT 


AATACTCTTG 


AATGAATTCG 


1260 


20 


CGTTCTTTrr 

WW 1 X W 1 i 1 V>\« 


Tit f w T w TYl A TY2 T 
10^.1 xoaxVjx 


vjr 1 xuAi 1 A1A 


1 x LftAlAb 


CGCTCTTAAA 


CTCAAGGATT 


1320 




TTACCTCTTC 


GTATACTACA 


A 21 it IV T A ATTA 


f^TTAr'TfTr'PA 


TTTCTGTTTT 


CGATGTATTA 


1380 




GACGGTACAA 


ACTCCCCGCC 


TAT ATTTWFA 
1AXA1 X iVjlM 


1 1 1 xvAatAA 


1 LLAUjij 1 Vj 1 


CATTATTTCA 


1440 


25 


CTTCTTAAAT 


CTTCAAGTTG 


TTTATGATAA 


TTAf^ATAAT 
X X nuutn X AA X 


r* iv c iv n a & rr^ 


ATCTTCTAAC 


1500 




TTTCGAACTG 


TTGATAATTT 


TAATCCGTAT 


TTPTTTTTAf? 

X llyi 1 XXX AO 


TPATftA ATAf* 

X \—f\ X V7AA X AV_ 


V-s- 1 1 ALIA 


1560 




AATATGTTTA 


ATCTTCAAAG 


TGTCTCAATC 


TArTTPTTAA 

x x x v» x x nn 


T ATPTrT A TP 
1/11 X ^» X nl V* 


1 V_ x V-kJ\- 1 CI 1 


1620 


TAACTTTTAC ATCACCTTTT AACTGTTCCG CTTGTAACAT CACACCAAAC AATAAGATGA 1680 

CTAGTAATAT AATTGCTATG ATTAACCACA TCATCTACTC CGACACCTCC GCCCTCATCA 1740 

AATCAGACTG ATCACTCAAC TTTGCGAAGT CACTTGGCGC CTCTACATCA TCATTAGCCG 1800 

TCATCATAAT ATATACTTGC TCAGTTACAT ACTTACCTAA CTCATACATC GCTAGTAAGA 1860 

ATAATAGTCT CAAAATTTCT TTAACCACCA CTAAACACCC CATGTTAATT TATCGATAAT 1920 

TTGTATAGCT TGTTTTAATG CGTCTCTTTT TTCTTTGATA TCTCTATTAT CGCCATCTTC 1980 

ATCAGCTGAC ATTAACTCAC TGTCATATTC ATATAATAGT TCTGATATTT CATTACTAGC 2040 

TACTACTAAT AAGTTTTCAT CTACATCAAT CGTTACCGTT TTCTTTGGCA TCTCCATCTC 2100 

TCCTTATCTT AACTTGTGCC TCGTATTTGC GCTCAGCTTC TTCTTTACTC TCTGCCTCAA 2160 

CAACTGTAAA CGTCTGATTA TCTCTAGCAG TAGTAAAATG TTCATGTGGT TGTCCTGTTG 2220 

AATCTTTGAA TGTTGTGACT AAGTATTGCG TCACTTCTTA TCACTCCTTT GAATGATTCT 2280 

AAGTTTTTCT ACGAATAAAA GTATTAGTAC AACACTCAAT GTAGCCAACA TATTTTTTTG 2340 

CTTTGCAAAA TCTACTATAA CGATTAAGAC TAATAACATT CCAATTCTGC ATGTAAATAA 2400 
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1 AuAACjTAI 1 


GGAACTAATG 


TAATGATGTA ACTCACTTCC 


CCAAAACCTC 


CTTGACTCGA 


2520 




TCTAAGATGT CTTTACACTC CGCTACTTCC GAAGCCTTTT TCTCCACGTT CTGAAACACT 2580 

TTCGAATTCC TCCACTTGCT TTAGTTCAGG TGTCCATATA GGCACGATAA CCAATTGAGC 2640 

TAGTTTGTCT CCTTCGTTGA TTTGATAAGT TCCGTATTGT CTTATGGCGT CACTCAAATC 2700 

GATTTCTCCT TTAATATCAA AAACACCTGG TGTGATATAA CCATTCGATG CAATAGCGTC 2760 

ATTCTTGATA TTAATCCCTA AATTGCCGTG ATATCCCGCG TCTATCTTGC CTGTTTCAAT 2820 

CACTAAATGC GTTTTACTAC TTACACCACT ACGGCTAGTT AATAGTCCGA CATAGCCCTC 2880 

TGGTATGCTT ACAGCTACAT CTGTTTTAAT CACTGCCTTT TCTTGTGGCT CAAGTACGAC 2940 

AGTTTCAGCT GAGAATATGT CATAACCTGC ATCCGTCTTA TGATTTCGTT CGGGCATTCT 3000 

AGCATTTTCT GATAATAGCC TTACTTGTAA TGTGTTAGTC ATTTTCCTGC TCCTCCCTAG 3060 

CTGTAGCAAA CGCTATTCTC AATTTCAATC TTTCAACAAT ATGAATTAGT GCGGTATTGA 3120 

GGAATATTTC AAATTCTTCA ATGTTCTCAT CTATAAAATC AAGTATTTCT TCCTCTTGTT 3180 

CACTGTCAAA CTCGCTTAGT ACATCCCAAA TATTTATGTC GCTTTTGCTC GTTTCTAATA 3240 

CTCTTTTGAT TATTTCTGAA TTACTTTTAT TACTCATTTT CCTTGTTCCT CCTCATATTT 3300 

ATAGACAACT TGACCTGCCA TAATCCCTAC TGCTTCATCA AGTTCAATAC CTTCTTTAAC 3360 

TGAATGTTGA ATAGCATTTG TCATTCCCTC AAGTATTTCA TCAAACGCTT GTGCTCTCTT 3420 

ATACACGTCC TCAATCTCTT TTAGTAATCC CTCTGTGTCA TTACCGTTAT ACGCACTAGC 3480 

ACTGATCACT GATTGTTCAA TTTGTTCGCG GTTATTCATC ATTTCCATCT CCTCTAAAAT 3540 

AAAGTTAGTT GCTTCTGCTC CTCGTATTCC AAACCATGTT GCTTTATATA TGTTTCGAGC 3600 

TCTTCCGCTG TATCAAATGT CTTTTTCACG CCTTGCCAAC CTGGCACGAT ATGCCCATGa 3660 

AAGTAATAAG TGCCGTTCAC TACATGGATA TGTGCCACTC GTTCGTTATC CTGATACAGA 3720 

TATCTCTTAG ATCCGAAAAA TTGGTTTAAG TATTCTTTAC ATGCGCTATC GGTTTTAGGC 3780 

ATTTATGCTT CCTGCCATTT CTTAAACATT TGGTTATAAG TAGTATCAAA CCAGTACGGA 3840 

TCACGTGAAT GTTTTTGAGG CACATTAAAC AAATGTGGCT TCTTCTTACG TAGTTCAGCC 3900 

TCTTTACGTC GTTGCCTAGC CATTTCACGC TCTTTGCTCT CTCGCTCCAT GATTTTGGAT 3960 

AACACAATTT CTTTATACTC AGCTAAGCGC ATACCATAAG GTGCATGTAA GGCTTCTAAC 4020 

AACGCCCAGC CACCTCGTAC TCTTTTTGCA ACCATTCCTG GAGTTAAACC GTTCTTTTTT 4080 

ATCAATTCAT TTTCATGTTC GGTAAATTTA TATGGTTTAC CGTTAATCTT TACGATACTC 4140 

ATTTATTCCA CCTCTATACA TTTACTTTTT TTAATCCAAT CCTCTAATTT GTGCGTGTTG 4200 
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ACATTTAAGT 


TAACCATCTC 


AGCTTTTCCG 


TTTTTATATC 


CACTAATAGT 


TGATCTTGAT 


4320 




ACGCCAGTTT CATTGTGCAA ATCTTGGACA CTTACGTTAT CTCTAGCCAT GATTACCCTT 4380 

AAA 11 AG 1 rG CGAATACTtC GTTCAACTTC ATTTATTCCA CCTCTATATA TGCATGTCTT 4440 

ATTGTTATGT TGTCATACTT TAGTAATTCG TCCGGATTGT CATCTAAGCG CTTTGCCAGC 4500 

C»TATC i i"i 1"1 CTTTATCCAC ATCATCGTAA TGCTGATATT CAACTTCTGT AGGTATTCTT 4560 

ATATCAATCG TTGCGTTTAT ATATGCTTGT TGTTGCATTA GATCACTTCA TTTCTCTTTT 4620 

TCnTTACGT CTGACTTTCA CTAAGTCCTC ATATACCATC CATTCTTGAC CTGTGTATTT 4680 

AGGCGCTTTA CATATCCACG TTAAATTCAC ATCTCTATAC TGATATCTGA ATATCTTCGC 4740 

TTTGATGTTG GCAACTTCAG TCGCCTTACC TTTAACGTCT ATAACTTCAA CCAGTTTCCC 4800 

TTCCTTCCAC AAAGAGAAAT CGGCTATATA CGTAATCGGT CTTTGTTTCC CGAATTTAGG 4860 

TTGTAATTCA AATTTCGGTT GTATTTCGAT ACGATCATAG TTAGTGCCAT TCATATTACT 4920 

TTCTAAATAT TGGTAATATT CGCACTCTAC TTTGCTATCA AATACAATTC CTTTGTACTC 4980 

AACTTTCTTA GCATTGTATT TACTCATTGT GCCACCTCTA AATATCAAAT ATCGTTGCTT 5040 

GCAATCCTAG CTCTTGCTCA TATAGAAGCC CGTGAGCGCC TTTGAATCGT TTTAGGTCAC 5100 

TATCAGTCAT AATTTTCTTT TCGTCGCTGA AATGGGCTCC TGTGAGCGAA TAAACTTCAT 5160 

TTACGTTGTC TTTATACTTG ATGACCTTAA TATCTTCCGT GCCATCTTCT CGGTATAAGT 5220 

AATATTTTTC TTTCGGCATT TTTTAACACT CCTTAATGTG TGTTTTCTTC CAGTTGATTT 5280 

CATTCATGAT TTTCTTTTCA ACTCTGTCGT AATCATCGAA AGGCGATAAC TCGTTATTGT 5340 

CCAACAATCT ATTGACCGCC CAACCAGTCT CGATATATAC ATTTGCTACA ATCGGGTCGC 5400 

TTTGCTTTGT CTCTTCATAC ATCGATTTCA ATAAGCTTTT GAATTGCATT ATATTCATGT 5460 

gaaaaacctc TGAGTCTTCT TGTAATACTC AAATTCAATT ATTCCGGTTT CGCCGTCTTT 5520 

GTTTTTGGCT ATGTTACATT CAACAATAGA TTTGCCAGTG ATACTGTCAT CTTCGTCACG 5580 

GTTATAATAA TCATCACGGT AAAGTAGCAT CGCTAAACTC GCATCTGCTT CTATTCCGCC 5640 

TGATTCTTTC ATGTCCGATA GCATTGGTCT TTTATCCTGT CTAGACTCGA CACCACGATT 5700 

CAGTTGTGAA AGTAGTACGA TGATTGCGCC TGTCTCGTTA GCGATTATCT TTAAGTCACG 5760 

TGATATCTTT TCTACTGCTA CACGTCTATC AACTTTCGCA TCAGTATCCA TCAGTTGAAG 5820 

ATAATCTATA AAAATAACTT GTTGCCTGTC TGAATGCCTC ATTGtTGCGC TCGCACATCT 5880 

TGCGGTGTGA TATTACTTTT ATCAGAAATA TCGATGCCTA ATTTCATGAT TTTATCCATC 5940 




GCATTCGTTA ACTTTGTTAA GTCATCCGGC GTTAAGTTCC 


TGATTTCTTT 


TATCTTTGTT 


6000 
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AGACTAAAGA AAGATGTTTT GTATCCATTT TGTGCTATGT TCAGCATCAT GTTTAATGCA 6120

AAACCTGTCT TACCCACTGA GGGACGCGCT GCGATGACGA TTAATTGTGA TGGTTCTAAT 6180

CCCCCTATTT TGTAATCCAT TAGCTTGTAA CCCGTCTTAA TTTGCTTCTT AGGGCTATCG 6240

CTGTATAACT CTTCGACAAA CTCCTCAACA AACTTCTTGG TTCCATCTTC TTTTTTGTTA 6300

GTAATTGTTT TTAAATCCTT GAGTTCATCA ATCAAGTTGT TAAAGTTTTG GTTCGTAGGT 6360

TGTTGTTTGA ACTCAGTTAC CAATTCGTTA GCTTTGTTGA GCTGATAACT TTCCAATAAT 6420

TCTTGTTGAT AACGTTCAAA GAAGCCATAT CCAATuAAAT CGGAGTTGTA AAGTTTAGTT 6480

A I AQaTATCTG CATCTAAAAA TTCTTTATCT TTAGTTGCTT TTAAATAGAT TTCTTGATGA 6540

TCTATCTTTC CGACGTCCAT TACATAATTG AAAAAGGTTT TAAACTTTTC GTTCGTAAAC 6600

ATGTAATCTT TAACTCTTAT CTTTTCTAAT ACGTCCGGTT GTTTAAGTAG CGTAGCGATT 6660

ATTGTACTTT CAATTTCGAA TTGTCCGTAA TTCATTCGTT TTCGCCCCCA AATTCTGCCA 6720

ACTTATTCAT GAACTTATCT AGCGCTATTT TTCTTTGTCT GACATATTCG GGGTCATTCT 6780

GCATTTTCCA TTGGTGTGTA GCGGTTTCGT TATCTACTGG CTCGATAGAT ACTTTTTTAG 6840

GTTCCTTACG CATGATTGCT GGTAAGTTAG GCGGGTACGG GTTGTTACTG TTGATATAAA 6900

CATCTACCGC TTTTACAGTT GGTTGATAAT CTCCATTTTG ACTTAATACA TCAATCCACA 6960

TTTCTAACTT CGGTTTATCA AAATCAATGT TGTATACGTA CCTAACTTTT TTAATAATTT 7020

CTAATGCTTG TGTTTTGCTC ATCGGCATTA GTCATCACTC AATTCTTTTT CCATTTGTGC 7080

AATGACATCA TCAGTAGTAT TTTTTCTAGG TGCTATTTTA TTTTCTGCAT CTTCTTTTGT 7140

TTTGACATTC TCTTTAGCCC AGTTGTTTAA AACTTTAATT AAATAGCCAC CATGCGCACT 7200

TTTGCTTTTA GTGTACTCAA CACCTACTTT TACAACTTCA AAAGCGTTTG TACCTATATC 7260

ATCAATAGCA AACCCTAATT GTTCCATTTG ATTAGGTGTT AACTTATCAT CCAAATTTGC 7320

AATTATATAT TTTATTGAAG ATGAGAAGAC GGCTTCTCTT TCTTCTTCTT TATTCTTATA 7380

TTCTTCTTCT TTTTCTTCTT CTCTTTCTTC TTCTTCTTCT GTATCGTTAC GTAACGTTAC 7440

GGTAACGTTA CGTTTTGCTT CTAGTAACTT TTTCTGTTTC TCACGATAGC GTTGTTGTCG 7500

CAATTTATTT TTTTCTTTAT GCTTAGCTTT GCTATCTAAG CTTTGATGCT TCTCCCAGTT 7560

TGTCACTTTT ATGACACCAT TAAOTTTTC AATCATGCCC AATGTCTCAA AAGTTTGAAT 7620

TGCTAACCTT ATTGAGTTAA TAGGTCTATT AAATTCATTT GCTAACATTT CTTCGTTGTA 7680

CGGCAAGTTT TCGGATAGCA TAATATAACC TTGTTCATTG TACTTTCCTG ATAAAGTTAG 7740

TAACTTAACC CAAATAGTTA TGATCGTATC TCTTTCGGGT AAAGCTTCGA TATATTTGAT 7800
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

ctcctttcag CATTTTGTTG AGCCTCTCAT CAACTTTTAT CCACGAGTCA TGCAAGTGAT 7920

ATTTATCATC AAACGACTTA ACGCCAATTG CGTGCTGTTC ATTATGATGT TGTCTACACA 7980

GTGCTAACAC ATGTTTGTCG TAGTGATTCA TTTTGTTTCT GTTCATGCCT CTGCCGACTG 8040


CTTCATAATG TGP PAGGTPT P. PGTG A PP. PT TTCCGCATAT TACACAGTTG CGGTTGATTG 8100

TAGCCCAATA TAATAA PG PT TTATPTTPPP TTAACAACTT ACTCGTTTCT ACACTCATAG 8160

GTATTTGATO ATP A A AP ATA AAPPPT&TH A TCAGTTCTAT TAACTCCCTT GCAACTTTCA 8220

t a p. a a p aptp \3\.\3 CAvj A\_ lu A i 1 1 CTTCAT AACCTTTCAT AATTTCCAAT TCTGTTTGTA 8280

AVjI lXiAl J. ACTGGTTCGC CCCAGTGAAG TTCTATATCT CTACACATTG 8340

PP A A TA TTTT TTTGCGTTGT TCTATAGATA GTTTTTTATT GTCCGGAACC TCTACTTCTG 8400

ATATCCGTTT TCTAGTAAGT CAATGTGACT TTGTTCAAGT TCAACACCAG 8460

1 AljCAACVaAC GGAATAAGTA CCGTCATTGT CTTTCTGGTA TCTTGTAATG TATTGCATTT 8520

AnftL CACvj X C CTAGAACGGT AAATCATCAT CATTGATTTC TATTGGACCA TTAGCATTAG 8580

ClaAAl LrtjVj 1 1 TGATTGTTGA CTCATTGGCG TCTGTTTCCC ATTTGCTTGC TGTTCTTTTT 8640

bill LAiLiL ATCAGTTTTA GCaTTCTGGTT TATTAACTAC TTCATCGTCT TTATTCCAAA 8700

V-l 1 1 1AL-A1A TGAGAGTCTT ACAAAAl ACT TGCCTTGTTC CTCGTTAAAT TTATTTTTAA 8760

CZT A P & A T AP T x LLuAi 1 1 IVj TTAATTAATT GATCTGTGTC AAaAGTTAAA TCTGGTAAGT 8820

X ^.L lrin ILIA LiAAkjl/iAL 1 CGATATATTG TTTTTCTTGA TAATCTTGTT 8880

GCAATGP.TPP. P.APPAATTGP 1 lului 1 X V7 1 ATTGTTTACC TTCGTTGTTT TCAAAAACAA 8940

TCGTGAAP.TA PTY1TPPTT& A AC 1 CviACAl I TGCAACTTTT ACTGTAAATT 9000

CTCCAGCTCC TAAJLAAGTPP PPAPPTTTP A TGAATGCCTC TTGATTAGTT TCTTGAATGT 9060

ATTGTGTTCT AP.CAGTGATT TTPATA ATTT TTATACCGTC CTTTTAATTA ATTTTTAATT 9120


ACCATTTCTA ATTGCTTGTA CAACATCGTT AATACTTGGA TTAATGAAAC GTTTGTTGTT 9180

AATTTTGATG TTGCTTGAGT GTCTTATCTT TGTCTCGAAT AAATTTGATG GTTCAGCGTT 9240

AAGTACATAT TGATAAGTTT TTTCGCCGTC TTGCTCATGT TCTTCTATTG TCATTCTTGC 9300

TAACACGTCA GATTGACTGA TGACTGCTTT TTTTATTTGG TCTTGTGCCT CTATCGTGAT 9360

TGTTGGATTG ATAGTACTTC CCTCATCATC TTTGTCTTTG TTAATGCCCT CGTGTCCGCT 9420

TATAGCAAGA TGAAATTGAT AATGTTCTTG TAATTTAGAA ATATAACGAT AAATACTTAC 9480

AATGCGTGTA GCACACTCGC CCCAATCATT AAATGTCGGT TTCTTTGATT TACCGTCCAT 9540

GATGTCGTCC ATAGTGATAT CACGTAACTT TTGGATTGTT TCAATCACTA CAACATCAAT 9600
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

AAAATGCTTA TAATTCTTAA TCTGCACAAC TGCCCCATCT TCTGTTACCG TTGTTCCGTC 9720

CTCATTTATA TCTAGTACTA AGGCATTGTT ATCTTTTGTT AAAAACGTAG TTTTACCAGT 9780

ACCGAACTTG CCGTATATCG CAAATTTATA AAACTTGTTT GCATTTTGTT TGCTGATGTC 9840

TTTTACACCT AGTTGCGTTA AAATATCGAC ATCTTGATTA GTTTTTTCAG TCATCTATTC 9900

TCCCACCTTT ACCGTGTATG ACGTTGGTTT CTCCACAATG CTAGCACCCT CTAAAACTTC 9960

GCCGTTTGCG TCAATCAATG TGCCGTTTTC AGTTACATTG AAATCTTTCT TAATGTCTGA 10020

TTGGCTAAGT TTTTTAGTTA CTTTTACATA GTTGTCAAAA CCTCGTTGCT CAAGTTGTnT 10080

AATGACTTCT TGCTCATTGC TAACTTGAAT GACTTTTGAA CCTTTTCTGG CTGTCACTTT 10140

TCCGTAAGtG TATTCAACTT GAATTTGCTA TCTTGTTCTT TTTGTATTCT GTAATATTCA 10200

ATTACAAGGC TTTGTAAATA TTCTTTGCCA CTCTGTAATT TTTCTACTTC TTTATCTTTC 10260

CATTCGTTTA TGCGTTCAAT TTCTTTATTT GCTAAATCGT TGATTTCATT CTCTTTAGTT 10320

GTGATTGCAT CCAGTTTCTn AAAAACCCAG TTAGCACTGT CTAGATCAGT nACTTTGAAT 10380

CGGTCGTCTT GTTCGAATGT n 10401



(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2989 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

TTTCTCTCTA TTATTCTCGA TGCGTAGATA ATTGTTTAAA TTTAAGTTTA TAGTAATGTT 60

GAGTTTATAA TTTCATATAT CTAAAAACAG GTGTTGTATA TATAATCATT CATCTAGTTA 120

TACTTACTTT AAAAATAATA TAATTTCATG CGATGCAATT CATTGATGGA TGTTTTTAAT 180

CTTAATCAAA TCCAaATAAA GCATATATTT TTAAATTCAC TTTCTTTCGA ATCGATTTTT 240

ATCTCTTGnA TTAAACTTTT CCATTGTTTC ATTAAAGCTC TCTGTCATAT CTATTCCCAT 300

TGAATTCGCT AAACATAACA ACACAAATAA ATTATCACCT AATTCTGCTT TAATCGTATT 360

TGCTTCCTCT GAATCTTTCT TCTTTTTTTC ACCATAGGTA TGATTTATTT CACGTGCAAG 420

TTCGCCCACT TCTTCAGTCA ATCTAGCTAA GTTAGCTAAT GGTGAAAAAT ATCCTGTTTT 480

AAATTGTCCA ATATATTCAT CAACTTCACG TTGCATTTCT ACCATTGATT TCATTTCTAC 540

GTTCTCCTTA TATTGCATTT CTAATATAGT ATATATCAAT TTGAAGTCTC ATGCATGTTT 600
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AATTCAGTTT ATATAAATGT AATGCATTCC TAACTAAATT AAATCAATTG AAATTGGGAT 720 

TATAACTTTA TGATACGTAC CACTACAATA AAATAATATA GTGAATAATC TACCATTAGA 780 

AAAATAAGCA CAAAAAAACT AGCAACCACA CAAAAATGTG ATTAGCTAGT TAATAAGTGT 840 

CTAATTTAAG TTAATTGTTA ATCTATAAGA TTAATCACTT GAACGCGCAA TCAAAATAAT 900 

ACGTACAAGC TCTGCTACAG CGACTGCAGT TGCTGCAACA TAAGTCATTG CTGCTGCAGA 960 

TAATACTTTA CGCGCATGCT TGTATTCTTT TTCATTTACA ATGTTCAATG CCGTAATTTG 1020 

TTTCATCGCT CTTGAACTCG CATCAAACTC AACTGGTAAC GTAACAATTG AGAATAATAC 1080 

CGCTAATGAC ATTAAACCAG CACCAATCCA TAAAGCAGTT GAACCaAATG CACTACCTAT 1140 

CGCTGTTAAG ATAATACCTA ACATGATGAT CATATAACTT AATGAACTCC CTAGGTTTGC 1200 

AACAGGTACT AATGCTGCTC TGAATCTTAA GAACCAATAT CCTTGGTGAT CTTGAATGGC 1260 

ATGACCAACT TCGTGGGCTG CAATTGCAGT TCCAGCAACT GATGGTCTGT CATAGTTTGC 1320

AGGAGATAGT GAAACAACTT TCTTTTTAGG ATCGTAATGA TCTGTTAAGA ATCCTTCACC 1380 

TTTAACAACT TCGACATCAT AAATACCGTT TGCATGTAAA ATTTCTAATG CAACTTCACG 1440 

ACCCGTTTTA CCACTAGTTG ATCTAACTTG TGAATATTTC TCATAGTTAG ATTTAACTTT 1500

GTGTTGTGCC CATAAAGGAA GCACCATTAA TATTACGAAA TAAATTATCA TAGTAAAAAT 1560 

TGAAGACAAT AAACTCACTC TCCTTTATAA ATATTTTACT GTCATTTGCC GTTTTTATCA 1620 

AATCATTTAC ACTTTAATAA TTTGTTTAAT TCAATATAAA GCAAAAGTCC AAAAACACTT 1680 

AGACAACATG ATAATACACC AATTTGCCAC ACATGTGTAG TTATAAAATC ATAATATGGA 1740

AATTGAAGGT GAAAATAGTC AATATAATCA TTCAAAAACA CCCAAATCAT yGCTACACTG 1800 

ATTCCAATCA TAGAACGTTT AAACCTAGGA TAGAAGTAAA TTGCCTGAAC AGCCATTATA 1860 

CTGTGGGAAA ACATTAATAC CAAACCATTT ACTGTAATAT CACCTTGTTC AATAATAAAT 1920 

AATATATTCA TTATAACTGC CCAAATCCCA TATTTGAATA ATGTTACAAA TGCCAGTGCA 1980 

TCGATAATAC TATTTTGTTT TTGAATTAAT ATCAATGAGA TAGAAATAAC TAAGTATAAT 2040

ATTGCAGTTG GGCTATCTGG AACAAAAATC TTAAAATGCC AGGGCGTATG ACTTAATTGT 2100

TCACCATACC ATATATAACC ATAAATCATC CCTAATATAT TACAAATGAG TAGCATCATT 2160

AACCAAGAAC GTTGATAAAG TGTATATTGC CAAAATGCTT TAATTGTCAT CTGCTAAGTC 2220 

CTCAAATTGA TTATGTTTAT TTACTAGCTT GAGTGTATTT AAAATTTGCG TTAGTTGATA 2280 

AAAACGTTGC TTTTCATTCA TCTGTAAACT TAAATCAATA TTGTGTAACA AGTAATCTAT 2340

TAATAACGCA TGTTTATGCC GATCTATAGC CATACTATTT AAGTCATGAA GATAAGTTTG 2400 
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TGACACGTTT GCGAAGTGAA TTTGAATATC AAAAGCACAG TTATGATTAG CGATATAATC 2520 

AAATATTTCA TTTGTATTCA TTAACTTTAT ATTACGCTTA GTAAATTGAA TTGCAGAAGC 2580 

GTGACTTCCC ACTTCTGCAA TTTCTAATGT TTCATGATGA TTAATTTTTG TATCTACAAA 2640

ATGAATGTTT GCCAATTTCG CCTCATTCAC TTTTATATAG TTAAGCACCC AAACTGCAAT 2700 

ACGCGACTTA AATCGATATT GAAAAAGTAA ATATTCAATA AAACTTTCTT TAATTTGATT 2760 

GAGTGTCTCT GACATCAAAT ACCCCATTTT AAGATTGCAA TCTTGaTAAT TCGTCATGCC 2820 

AATTTTCGTT ACTTGGcTCT AGTTCCAACA ATTGATTTAA AATAGTAATT GCTTGTTCCT 2880 

TTTGACCAAT TTCAATTAAA TAGAAATAAT AATCACTCAT AAAATCAATA TTTGTTTTCA 2940 

TCGTTGGATA TGCTAATTCA AAGAAATGTT GAGCTTCTTT ATCTCGCTC 2989 
(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1143 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
CATCAACTCC TTAATTACAC TGTAAATGAT ATGCGTCTTT TTGACAACTA TATTTGTCAA 60 
ATCTACACCA AAAAATATGA TTATCCACCT ATGTATGACA TTTTGAAACA AACACCTCAA 120 
CGCCTACAAG TCATAATTGT TTACTTTCGT TACACCTTCC TGCATAATTA ACAGCATTCT 180 
AATTTTAGTA TGATGCACGC ATTTTCACTA AATCAAACCA TTCAAAGGAG ACTATTATGG 240

CATTTACATT ATCTGCAATT CAACAAGCAC ATCAACAATT TACTGGTGTT GACTTTCCAA 300
AACTATTCAA AGCTTTTAAA GATATGGGGA TGACTTACAA TATCGTCAAC ATTCAAGATG 360
GCACTGCAAC ATACGTACAT CAATCAGAAG ATGATATCGT TACGTCATCT GTAAAAAGTA 420

ATCATCCTGT TGCTCAAAAA TCAAACAAAA CAATAGTTCA AGACGTCTTA ACTAGACATC 480

AACAAGGGCA AACAGATTTT GAAACATTTT GTGATGAAAT GGCTGAAGCT GGCATTTATA 540

AATGGCATAT CGATATTCmA GCGGGCACTT GTACTTATAT CGACTTGCAA GACCAAGCTG 600

TTATTTCAGA ATTAATCCCT CAATAAACTA TATTTATAGC AACATTTTAA TTATTTCATA 660
AAATTTTATT GATAATCATT ATCGTTCGGT ATAAAGTAAA TACTATATAC TACTTATGAG 720

TGAGGTTGAT TATCATGATA ACTAACACTT TTATTTTAGG CATCACAGGC CCAACAAGTC 780

TTGTCGTCAT TAGCATTATC GCTTTAATTA TTTTTGGTCC GAAAAAATTA CCACAATTTG 840

AGTCTCACGA TACACCCAGT AAGGAATCGA AACAACAGCG AGAGCAATAG CACTGACCAC 960

ACCTTACTGG TTCACTTTAG CGAACTACGC CATCGGTTAG TAAAAATTTT ATTGTCGTTC 1020

GTCATTACGG TCATCGTCGT ATATGTyTCA TCATTTTGGT GGATGACACC ATTCATAACG 1080

TATATyACCC GgCACATGTG TcCTTACATG CATTTcATTC ACAGAAATGA TACAAATAAC 1140

GTG 1143



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7953 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

CAACGCCTGA ACGTAAACCA TATCGTTTCG CGATTTCCTC ATCTTGACTA TTTACTAAAA 60 

ACTCTCTCAT GGCGATTAAT GTTTCTTTTT CTTCTTTAGT TAATGGTAAT TCTAACTCAG 120

CTGCTTTTTG ACGCAAAGTT GGATGACCAT CTCTAATGAT GTCTTTCATT GTTAACATAT 180 

ATTGCACCTT CCTTATTTTA ATTTGTTTTA GTTGAATGAC AGTAAAAAGG TTGTTAAGAT 240 

ACTCATACAT TTTTATGTGT AAATATCTAC AAAGTTAACC AACTACTGCC AATGTTTATT 300 

TTAGATAGTA TATGTAAATT TTCAaGAtAT GCgTAATTGC gTTAAAAAAT GaTTAAAGTG 360 

TTGGTTTCAA GCAATGaTAC TTTAGAAATT TATTTATCAT CTTGACTTTA AAAATTATAT 420 

TATAAATGAC GTAACTGTCA ACAGATATAC TTAGTArTGA AGATGTGTAA TGTAATTGTT 480 

TAAAATTGAT TTCCAAGCAG ATTTTATTTA TCATTTAATT TAAATAGCAA GTGGAGGTAC 540 

AAGTAATGAA ATTTGGAAAA ACAATCGCAG TAGTATTAGC ATCTAGTGTC TTGCTTGCAG 600 

GATGTACTAC GGATAAAAAA GAAATTAAGG CATATTTAAA GCAAGTGGAT AAAATTAAAG 660 

ATGATGAAGA ACCAATTAAA ACTGTTGGTA AGAAAATTGC TGAATTAGAT GAGAAAAAGA 720 

AAAAATTAAC TGAAGATGTC AATAGTAAAG ATACAGCAGT TCGCGGTAAA GCAGTAAAGG 780 

ATTTAATTAA AAATGCCGAT GATCGTCTAA AGGAATTTGA AAAAGAAGAA GACGCAATTA 840 

AGAAGTCTGA ACAAGACTTT AAGAAAGCAA AAAGTCACGT TGATAACATT GATAATGATG 900 

TTAAACGTAA AGAAGTAAAA CAATTAGATG ATGTATTAAA AGAAAAATAT AAGTTACACA 960 

GTGATTACGC GAAAGCATaT AAAAAGGCTG TAAACTCAGA GAAAACATTA TTTAAATATT 1020 

TAAATCAAAA TGACGCGACA CAACAAGGTG TTAACGAAAA ATCAwAAGCA ATAGAACAGA 1080 
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AAGAAAAGCA AGACGTTGAT CAATTTAAAT AATTAATATA ATACAGATGG TAGGAAACAA 1200

CTAATACAGT TCCTATTATC TGTATCTTTT TTTATTAAAA CAGAACTTTT TCAAATGGTT 1260

TAACAGTCCC ATTTATTTGT GGTACAATTA GTAAGGATAA AATGAATTTC TATACAATTA 1320

TGGGAAAGGT ATTGTGAATT GAATGGCTCC TAAGTTACAA GCCCAATTCG ATGCAGTAAA 1380

AGTTTTAAAT GATACTCAAT CGAAATTTGA AATGGTTCAA ATTTTGGATG AGAATGGTAA 1440

CGTCGTAAAT GAAGACTTAG TACCTGATCT TACGGATGAA CAATTAGTGG AATTAATGGA 1500

AAGAATGGTA TGGACTCGTA TCCTTGATCA ACGTTCTATC TCATTAAACA GACAAGGACG 1560

TTTAGGTTTC TATGCACCAA CTGCTGGTCA AGAAGCATCA CAATTAGCGT CACAATACGC 1620

TTTAGAAAAA GAAGATTACA TTTTACCGGG ATACAGAGAT GTTCCTCAAA TTATTTGGCA 1680

TGGTTTACCA TTAACTGAAG CTTTCTTATT CTCAAGAGGT CACTTCAAAG GAAATCAATT 1740

CCCTGAAGGC GTTAATGCAT TAAGCCCACA AATTATTATC GGTGCACAAT ACATTCAAGC 1800

TGCTGGTGTT GCATTTGCAC TTAAAAAACG TGGTAAAAAT GCAGTTGCAA TCACTTACAC 1860

TGGTGACGGT GGTTCTTCAC AAGGTGATTT CTACGAaGGT ATTAACTTTG CAGCAGCTTA 1920

TAAAGCACCT GCAATTTTCG TTATTCAAAA CAATAACTAT GCAATTTCAA CACCAAGAAG 1980

CAAGCAAACT GCTGCTGAAA CATTAGCTCA AAAAGCAATT GCTGTAGGTA TTCCTGGTAT 2040

CCAAGTTGAT GGTATGGATG CGTTAgcTGT nATATCAAGC AACTAAAGAA GCACGTGACC 2100

GCGCAgTTGC AGGTGAAGGT CCAACATTAA TTGAAACTAT GACATATCGT TATGGTCCTC 2160

ATACAATGGC TGGTGACGAT CCAACTCGTT ACAGAACTTC AGACGAAGAT GCTGAATGGG 2220

AGAAAAAAGA CCCATTAGTA CGTTTCCGTA AATTCCTTGA AAACAAAGGT TTATGGAATG 2280

AAGACAAAGA AAATGAAGTT ATTGAACGTG CAAAAGCTGA TATTAAAGCA GCAATTAAAG 2340

AGGCTGATAA CACTGAAAAA CAAACTGTTA CTTCTCTAAT GGAAATTATG TATGAAGATA 2400

TGCCTCAAAA CTTAGCAGAA CAATATGAAA TTTACAAAGA GAAGGAGTCG AAGTAAGCCA 2460

TGGCACAAAT GACAATGGTT CAAGCGATTA ATGATGCGCT TAAAACTGAA CTTAAAAATG 2520

ACCAAGATGT TTTAATTTTT GGTGAAGACG TTGGTGTTAA CGGCGGTGTT TTCCGTGTTA 2580

CTGAAGGACT ACAAAAAGAA TTTGGTGAAG ATAGAGTATT CGATACACCT TTAGCTGAAT 2640

CAGGTATTGG TGGTTTAGCG ATGGGTCTTG CAGTTGAAGG ATTCCGTCCG GTTATGGAAG 2700

TACAATTCTT AGGTTTCGTA TTCGAAGTAT TTGATGCGAT TGCTGGACAA ATTGCACGTA 2760

CTCGTTTCCG TTCAGGCGGT ACTAAAACTG CACCTGTAAC AATTCGTAGC CCATTTGGTG 2820

GTGGCGTACA CACACCAGAA TTACACGCAG ATAACTTAGA AGGTATTTTA GCTCAATCTC 2880 
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CTATTAGAAG TAATGACCCA GTCGTATACT TAGAGCATAT GAAATTGTAT CGTTCATTCC 3000 

GTGAAGAAGT ACCTGAAGAA GAATATACAA TTGACATTGG TAAGGCTAAT GTGAAAAAAG 3060 

AAGGTAATGA CATTTCAATC ATCACATACG GTGCAATGGT TCAAGAATCA ATGAAAGCTG 3120 

CAGAAGAACT TGAAAAAGAT GGTTATTCTG TTGAAGTAAT TGACTTACGT ACTGTTCAAC 3180 

CAATCGATGT TGACACAATT GTAGCTTCAG TTGAAAAAAC TGGTCGTGCA GTTGTAGTTC 3240 

AAGAAGCACA ACGTCAAGCT GGTGTTGGTG CAGCAGTTGT AGCTGAATTA AGTGAACGTG 3300 

CAATCCTTTC ATTAGAAGCA CCTATTGGAA GAGTTGCAGC AGCAGATACA ATTTATCCAT 3360 

TCACTCAAGC TGAAAATGTT TGGTTACCAA ACAAAAATGA CATCATCGAA AAAGCAAAAG 3420 

AAACTTTAGA ATTTTAATAC ATTTTAAAAG TTAACGAAGT TAGCGTATTT TAGTCTCATT 3480

GATTAAAATG AAATGTTTAA TTTACGAAAT CTTAGGAGGG CAAAAACGTG GCATTTGAAT 3540

TTAGATTACC CGATATCGGG GAAGGTATCC ACGAAGGTGA AATTGTAAAA TGGTTTGTTA 3600

AAGCTGGAGA TACTATTGAA GAAGACGATG TTTTAGCTGA GGTACAAAAC GATAAATCAG 3660

TAGTAGAAAT CCCATCACCA GCATCTGGTA CTGTAGAAGA AGTTATGGTA GAAGAAGGTA 3720

CAGTAGCTGT AGTTGGTGAC GTTATTGTTA AAATCGATGC ACCTGATGCA GAAGATATGC 3780

AATTTAAAGG TCATGATGAT GATTCATCAT CTAAAGAAGA ACCTGCGAAA GAGGAAGCGC 3840

CAgcAGaGCA AGCACCTGTA GCTACTCAAA CTGAAGAAGT AGATGAAAAC AGAACTGTTA 3900

AAGCAATGCC TTCAGTACGT AAATACGCAC GTGAAAAAGG TGTTAACATT AAAGCAGTTT 3960

CTGGATCTGG TAAAAATGGT CGTATTACAA AAGAAGATGT AGATGCATAC TTAAATGGTG 4020

GTGCACCAAC AGCTTCAAAT GAATCAGCTG CTTCAGCTAC AAGTGAAGAA GTTGCTGAAA 4080

CTCCTGCAGC ACCTGCAGCA GTAACATTAG AAGGCGACTT CCCAGAAACA ACTGAAAAAA 4140

TCCOTGCTAT GCGTAGAGCA ATTGCGAAAG CAATGGTTAA CTCTAAGCAT ACTGCACCTC 4200 

ATGTAACATT AATGGATGAA ATTGATGTTC AAGCATTATG GGATCACCGT AAGAAATTTA 4260 

AAGAAATCGC AGCTGAACAA GGTACTAAGT TAACATTCTT ACCTTATGTT GTTAAAGCAC 4320 

TTGTTTCTGC ATTGAAAAAA TACCCAGCAC TTAACACTTC ATTCAATGAA GAAGCTGGTG 4380 

AAATCGTTCA TAAACATTAC TGGAATATCG GTATTGCAGC AGACACTGAT AGAGGATTAT 4440 

TAGTACCTGT TGTTAAACAT GCTGATCGTA AGTCTATTTT CCAAATTTCA GATGAAATTA 4500

ATGAATTAGC TGTTAAAGCA CGTGATGGTA AATTAACAGC CGATGAAATG AAAGGTGCTA 4560

CATGCACAAT CAGTAATATC GGTTCAGCTG GTGGACAATG GTTCACTCCA GTTATCAATC 4620

ACCCAGAAGT AGCAATCTTA GGAATTGGCC GTATTGCTCA AAAACCTATC GTTAAAGATG 4680
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ATGGTGCAAC TGGCCAAAAT GCAATGAATC ACATTAAACG TTTATTAAAT AATCCAGAAT 4800

TATTATTAAT GGAGGGGTAA AACATGGTAG TTGGAGATTT CCCAATTGAA ACAGATACTA 4860

TAGTAATCGG AGCAGGTCCT GGTGGATACG TTGCAGCAAT TCGTGCAGCT CAATTAGGAC 4920

AAAAAGTAAC AATCGTTGAG AAAGGTAATC TTGGTGGTGT TTGCTTAAAC GTAGGATGTA 4980

TTCCTTCAAA AGCATTACTA CATGCTTCTC ACCGTTTTGT TGAAGCACAA CATTCTGAAA 5040 

ACTTAGGTGT TATTGCTGAA AGTGTTTCTT TAAACTTCCA AAAAGTTCAA GAATTCAAAT 5100 

CATCAGTTGT TAATAAATTA ACTGGTGGTG TTGAAAGCTT ACTTAAAGGT AACAAAGTTA 5160 

ACATCGTTAA AGGTGAAGCA TATTTCGTAG ATAACAATAG CTTACGTGTT ATGGACGAAA 5220 

AGAGCGCACA AACATACAAC TTTAAAAATG CAATCATTGC AACAGGTTCA AGACCAATTG 5280 

AAATTCCTAA TlTCAAATTC GGTAAACGTG TTATCGACTC AACAGGTGCT TTAAACTTAC 5340 

AAGAAGTACC aGGTAAATTA GTTGTAGTTG GTGGAGGATA CATTGGATCA GAATTAGGTA 5400

CAGCATTTGC TAACTTTGGT TCAGAAGTAA CCATCCTTGA AGGTGCTAAA GATATCTTAG 5460 

GTGGCTTCGA AAAACAAATG ACACAACCTG TTAAAAAAGG TATGAAAGAA AAAGGTGTTG 5520 

AAATCGTTAC TGAAGCTATG GCTAAATCAG CTGAAGAAAC AGATAACGGA GTTAAAGTTA 5580

CTTATGAAGC TAAAGGCGAA GAGAAAACAA TCGAAGCTGA TTATGTATTA GTAACTGTAG 5640 

GTCGTCGTCC AAACACAGAC GAATTAGGCC TAGAAGAATT AGGTGTTAAA TTCGCTGACC 5700 

GTGGATTATT AGAAGTTGAT AAACAAAGCC GTACGTCTAT CAGCAATATC TATGCAATTG 5760 

GTGATATCGT TCCAGGTTTA CCACTTGCTC ACAAAGCTAG CTATGAAGCT AAAGTTGCTG 5820 

CTGAAGCAAT TGATGGTCAA GCTGCTGAAG TTGATTACAT TGGTATGCCA GCAGTATGCT 5880 

TTACTGAACC AGAATTAGCT ACAGTTGGTT ATTCAGAAGC GCAAGCTAAA GAAGAAGGTT 5940 

TAGCAATTAA AGCTTCTAAA TTCCCATATG CAGCAAATGG TCGTGCATTA TCATTAGATG 6000 

ATACTAACGG ATTTGTTAAA CTTATTACAC TTAAAGAAGA TGATACTTTA ATCGGTGCTC 6060 

AAGTAGTTGG TACTGGTGCA TCAGATATTA TCTCTGAATT AGGTTTAGCA ATTGAAGCTG 6120 

GTATGAATGC TGAAGATATC GCATTAACAA TCCATGCACA TCCAACATTA GGTGAGATGA 6180 

CTATGGAAGC AGCAGAAAAA GCTATCGGAT ACCCAATCCA TACAATGTAA TAACTGATTA 6240

TCTATAAAGA TTCAGTCATT AAAAGCTGTA GCATATGCTA CGGCTTTTTT GTTTTAGGTA 6300 

AAGTAATGTA AGGAAATTGA TTTGAGATAT CGTTAACATG TGACATGCAT GTTATACTAG 6360 

CGATGCTAAT AAAAGAATTG AAATGGAGGG TTCAACAATG GAATATGAGT ATCCAATTGA 6420 

TTTAGACTGG AGTAATGAAG AGATGATTTC AGTGATAAAT TTCTTTAATC ATGTAGAGAA 6480
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AATTGTGCCT GCTAAAGCAG AGGAAAAACA AATTTTTAAT ACTTTCGAAA AAAGTAGTGG 6600 

CTATAATAGT TACAAAGCAG TTCAAGATGT AAAAACTCAC TCTGAAGAAC AAAGAGTAAC 6660 

5 AGCTAAAnAA TAATTCGTTC GAAATTAACA CAATTTAATA GGAA T TTT T C TTTAAAACTA 6720 

TTGCTAATAA AGCTATATTT TGATACCTTT ATCAAGTGTT AAACAAAATG TTTGATAAAA 6780 

GTAAACTTAA TATAGCTTTT TTAGGTGGAA AAATAAATGA ACATAGGTAA TAAAATTAAA 6840 

AATCTTAGAA GAATTAAAAA TTTAACGCAA GAAGAACTTG CTGAACGTAC AGACTTATCG 6900

AAAGGCTACA TTTCACAAAT AGAAAGTGAA CATGCCTCAC CAAGTATGGA AACTTTCTTA 6960

AATATTATAG AGGTGTTAGG AACGACGCCA AGTGAATTTT TTAAAGACAG TGAAAATGAA 7020

AAAGTATTAT ACAAGAAGGA AGAACAAGTT ATTTATGATG AGTATGATGA AGGTTATATA 7080

TTAAATTGGT TAGTTTCAAA GTCAAATGAA TATGATATGG AGCCATTAAT ATTAACTTTA 7140

AAGCCTGGAG CATCATATAA AAATTTTAAT CCATCAGAGT CTGATACGTT TATTTATTGT 7200

ATGTCAGGTC AGATAACACT TAATTTAGGC AAAGAGATAT ATCAAGCACA AGAAGAAGAC 7260

GTTTTGTATT TTAAAGCACG AGATAATCAT CGTTTGTCAA ACGAATCAAA CAATGAAACA 7320

CGAATACTTA TTGTAGCGAC AGCTTCATAT TTATAGGGGG GATCTTATTT GGAACCGTTA 7380

TTATCATTAA AATCAGTTAG TAAAAGCTAT GATGATCTTA ATATCTTAGA TGACATAGAT 7440

ATTGATATTG AATCAGGATA CTTTTATACA TTATTAGGTC CTTCAGGTTG TGGTAAAACA 7500 

ACAATTTTAA AATTAATTGC AGGGTTTGAA TATCCTGACA GTGGTGAAGT GATTTATCAA 7560 

AACAAACCAA TTGGTAATTT ACCACCAAAT AAACGTAAAG TGAATACAGT CTTTCAAGAT 7620 

TATGCATTAT TTCCACACTT AAACGTCTAT GATAATATCG CTTTTGGTTT GAAATTAAAA 7680 

AAATTATCAA AAACCGAAAT TGATCAAAAA GTAACTGAGG CATTAAAATT AGTAAAACTT 7740

TCAGGTTATG AAAAAAGAAA TATTAATGAA ATGAGTGGCG GACAAAAGCA ACGTGTTGCA 7800 

ATTGCACGTG CTATCGTAAA TGAACCAGAA ATATTATTGT TAGATGAATC TTTATCCGCA 7860 

TTAGATTTGA AATTGCGTAC TGAAATGCAA TATGAATTAC GAGAATTGCa ATCTAGATTA 7920 

GGtATTACAT TTATATTTGT aACACATGAT CCA 7953 
(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2347 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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GGCGTGATCA TACGACCGTC ATTCATGCTC ATGAAAAAAT ATCTAAAGAT TTAAAAGAAG 60 

ATCCTATTTT TAAACAAGAA GTAGAGAATC TTGAAAAAGA AATAAGAAAT GTATAAGTAG 120 

GAAACTTTGG GAAATGTAAT CTGTTATATA ACAGCACTAA TGATnACAAT CATTTTTTAC 180

ATTTCTATAT GCTAATGTGG CAAGATGAGC AAAACTCATT TTGTGGATaA TGTTTaAAAG 240 

TCATACACAC CATACACAAG TTATCAACAT GTGTATAAyT cGcCAAATCT ATGTTTTTAA 300 

GACTTATCCA CCAATCCACA GCACCTACTA CTATTACTAA GAACTTAAAA CCTATATAAT 360 

TATATATAAA CGACTGGAAG GAGTTTTAAT TAATGATGGA ATTcACTATT AAAAGAGATT 420 

ATTTTATTAC ACAATTaAAT GACACATTAA AAGCTATTTC ACCAAGaACA ACATTACCTA 480 

TATTAACTGG TATCAAAATC GATGCGAAAG AACATGAAGT TATATTaACT GGTTCAGACT 540

CTGAAATTTC AATAGAAATC ACTATTCCTA AAACTGTAGA TGGCGAAGAT ATTGTCAATA 600

TTTCAGAAAC AGGCTCAGTA GTACTTCCTG GACGATTCTT TGTTGATATT ATAAAAAAAT 660

TACCTGGTAA AGATGTTAAA TTATCTACAA ATGAACAATT CCAGACATTA ATTACATCAG 720

GTCATTCTGA ATTTAATTTA AGTGGCTTAG ATCCAGATCA ATATCCTTTA TTACCTCAAG 780

TTTCTAGAGA TGACGCAATT CAATTGTCGG TAAAAGTGCT TAAAAACGTG ATTGCACAAA 840

CAAATTTTGC AGTGTCCAcC TCAGAAACAC GCCCAGTACT AACTGGTGTG AACTGGCTTA 900 

TACAAGAAAA TGAATTAATA TGCACAGCGA CTGACTCACA CCGCTTGGCT GTAAGAAAGT 960 

TGCAGTTAGA AGATGTTTCT GAAAACAAAA ATGTCATCAT TCCAGGTAAG GCTTTAGCTG 1020

AATTAAATAA AATTATGTCT GACAATGAAG AAGACATTGA TATCTTCTTT GCTTCAAACC 1080 

AAGTTTTATT TAAAGTTGGA AATGTGAACT TTATTTCTCG ATTATTAGAA GGACATTATC 1140 

CTGATACAAC ACGTTTATTC CCTGAAAACT ATGAAATTAA ATTAAGTATA GACAATGGGG 1200 

AGTTTTATCA TGCGATTGAT CGTGCCTCTT TATTAGCGCG TGAAGGTGGT AATAACGTTA 1260 

TTAAATTAAG TACAGGTGAT GACGTTGTTG AATTGTCTTC TACATCACCA GAAATTGGTA 1320 

CTGTAAAAGA AGAAGTTGAT GCAAACGATG TTGAAGGTGG TAGCCTGAAA ATTTCATTCA 1380 

ACTCTAAATA TATGATGGAT GCTTTAAAAG CAATCGATAA TGATGAGGTT GAAGTTGAAT 1440 

TCTTCGGTAC AATGAAACCA TTTATTCTAA AACCAAAAGG TGACGACTCG GTAACGCAAT 1500

TAATTTTACC AATCAGAACT TACTAAAAAT AAATATAAAT AAAGGATGAC GTGATTAATT 1560

AAAACGTCAT CCTTTATTTT TTGGCAAAAA TAATTCTAGG TGCGTATGTA AAATAAATTT 1620

GGCAGCATTT TAAACAGCAA ATAAAAGACG CCAATTAAAT TTATGACAAA TGTATCCAAA 1680

ATTTAATAAG TGTGCTTATA TGCCCTTTAA ATTTAAAATT TTAATAGTCA ATAACAAGTT 1740 
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AAAAATAAGA ATTAATTATT TATATGTAAA CGGTTTCTAC CTCTATTTTA AATGAAATTT 1860

GTGACAAAAA AAGGTATAAT ATATTAATGA CATACAAAGA AATGGAGTGA TTATTTTGGT 1920 

TCAAGAAGTT GTAGTAGAAG GAGACATTAA TTTAGGTCAA TTTCTAAAAA CAGAAGGGAT 1980 

TATTGAATCT GGTGGTCAAG CAAAATGGTT CTTGCAAGAC GTTGAAGTAT TAATTAATGG 2040 

AGTGCGTGAA ACACGTCGCG GTAAAAAGTT AGAACATCAA GATCGTATAG ATATCCCAGA 2100 

ATTACCTGAA GATGCTGGTT CTTTCTTAAT CATTCATCAA GGTGAACAAT GAAGTTAAAT 2160 

ACACTCCAAT TAGAAAATTA TCGTAACTAT GATGAGGTTA CGTTGAAATG TCATCCTGAC 2220 

GTGAATATCC TCATTGGAGA AAATGCACAA GGGAAAGACA AATTTACTTG GAATCAATTT 2280

ATACCTTAGC TTTAGCAAAA AGTCATAGAA CGAGTAATGG ATAAGGGACT CCATACCGTT 2340 

TTAATGC 2347 

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

ACAAGACGTn TCTATAACTT ATCTGAAATC GCTCGTCAAG ATAAAGATTA TGCAACTATC 60 

TCATTCTTAA ACTGGTTCTT AGATGAACAA GTCGAAGAAG AATCAATGTT TGAAACTCAC 120 

ATCAATTATT TAACTCGTAT CGGCGATGAC AGCAATGCAT TATATCTTTA CGAAAAAGAA 180 

CTTGGCGCTC GTACATTCGA CGAAGAATAA TTAAACATCA CTACAATAGA CAGATAAATA 240

TCATACGACA TGATAGGCAT TTGGGTCACT TACAATAACC CAATGTCTAT ATTATTTTGC 300

TTTACGGAGA TCACTAGATT CATTTTCTGA ATCATTGATC TGCGTTTTTT CATTTTCAAG 360

GCTAATTATT GTATTTTTAG TCATTTATTT TTTAAACTAC TAATGTTAAT AACTCTAAAT 420

TTGATGTTGA ATTAATTTGA CGATTTTAAA GCATATCATC ATTTACTTTT TAATCAGAGT 480

TACATCCAAA TGATAGATTT CACGTTATAC CTTCACGTAT AATATTATGT ATCGTTTGTA 540

AGCAAATGAC TAAAAGTCTA TTAATATATA CATTTAATTA ATTGAAAGGA TTGACTACAT 600

GATACAAGAT GCGTTTGTTG CACTTGATTT TGAAACAGCA AATGGTAAAC GTACAAGTAT 660

TTGTTCTGTC GGAATGGTTA AAGTCATTGA TAGTCAAATA ACAGAAACAT TTCATACTCT 720

TGTGAATCCG CAAGACTATT TTTCACAACA AAATATTAAA ATTCATGGCA TACAACCAGA 780 
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aGATTTACCT GTTGTCGCAC ATAACGCGGC ATTTGATATG AACGTCTTAC ATCAAAGCAT 900

TCAAAATATT GGTTTACCAA CTCCAAATTT AACTTACTTT TGTAGTTATC AACTTGCTAA 960

AAGAACCGTT GATTCGTATC GATACGGTTT AAAACATATG ATGGAGTTTT ATCAATTAGA 1020

TTTTCATGGT CATCATGATG CATTGAATGA TGCCAAAGCA TGCGCAATGA TTACTTTTAG 1080

GCTACTGAAA AATTATGAAA ATTTAACATA TGTAACTAAT ATTTATGGTA AAAATCTAAA 1140

AGATAAAGGC TAGGACTAAA TAAAATACTC CCTTCAAAAG TAAGCATTGT AAAAATGTAA 1200

ACTTTGCAGG GAGCTTTATT TTATATAAAG TCATATATCG TCATATTTTT ATAAGTTGAT 1260

TGTTCTAAAT TACCTACAGT GACACCAATA AGTCGAATTG GTACATCAGG GTCTTTTAAA 1320

TCGTTATAAA GTAAATATGC AATATTATAA ATATCTTCTT CAGAACTAAC CGAATCTCTT 1380

AAACTCATCT GTTTAGATAG CGTTTCAAAT TGATAAGTTT TAATTTTAAC CGTTACAGTT 1440

TTAGCTGACT TCTGTAATTT ATTTAGACGT TCAGCTGTTT TACCTGnACA ATTCCCATAC 1500

TTTTCTTAAA ATCTCTTCAT CATCATTCAC GTCTGTTGCA AATGTGCGTT CAGTCCCTAC 1560

TGATTTTCTT ACTCTTGATG ATTTCACTTC ACTATGGTCA ATACCGCGTG CCTTGTTATA 1620

TAAACCCCGA CCTCTTTTTC CAAACAAACG TATTAATTCA AATTCCGTTT TCTCATATAA 1680

ATCTCTACCG TTAAAAATAC CATTATCATG CATTACTTTT TTGGAAGCTT TACCTACGCC 1740

TGGaAAATCT CCAATATCCA ATGTCATCAA AATATCATGG aCATTTTGAT AATCAATCAC 1800

AGTCATACCA TCAGGTTTAT TCATACCACT CGCTAATTTA GCTAAAAATT TGTTATAAGA 1860

AACACCTGCA GATGCTGTTA AATGTGTCTG CTCTAGAATA TCTTTTCTAA TATACTGAGC 1920

AATTTTCGAA GCAGGAAGGT CTGGTCTCAC TAATTCTGTA ATATCTAAAT ACGCTTCATC 1980

CAATGACATC GGTTCTACCT TATCTGTATA ACTTCGGAAA ATAGACATAA TCTGCGCAGA 2040

TGTTTCTCGG TAAGCACCAA AATTACTTGT GACAAAGTAT CCATTTGGAC ATAATTTATG 2100

CGCTTGTGAC ATAGGCATTG CTGAATGGAC GCCGTATTTT CGTGCTTCAT AGGATGCCGT 2160

AGAGACAACA CCCCTACTGC TTGCTTTACC ACCAACAATG ACTGGTTTCC CTTTCAATTT 2220

GGGGTTATCT CTCATTTCGA CTTGTGCAAA AAAATAGTCC ATATCTATAT GAATAATTCG 2280

TCTCTCAGTC AAGTGCTCAC CTCCCTACTA ATTTTTACTT TTATAACGCA CAAAAATATC 2340

TCAACATAAT TATACGCTGT GTACGATTTT TTTACATAAA TCTTGCACTT AGCGATAACT 2400

ATATTGaGAT AACTACAAGT TGTTATaAAA TCAATTGCTA TTTAAGCATG ATGATGAAGA 2460

CGATTGAGTA AGAAAACATA GGTAATCTGA AATAATTCAA GCAAATTCAT TTTGTTGGTA 2520

TCATCATATT AAAATTTATT ATTGAGTCGG CTTTTGATGA TACAAATAAA TACTATCTTC 2580
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10 






AAAGCAATAA GCGGTATGCA TACTAAACAT AAAAATAAGT GATGAATAAC CAAATACCTT 2700 

AATTAAAATA AGCAAGCCAG TACTTAATAG GATTAGTGGT GACAGCATAA TAATTGAGAA 2760 

TTGCCATTTG TTGAAGCAAG CATCTGCTGT TTGGAATAAG ATTCTGTCTT TTTTTATATT 2820 

AAACATAGGT TTGCTATCTT TTTTAAATAA AAGAAATAAT GCTCTATGGA TAAGTTCATG 2880 

TAAAATCAAT AAAATAATGA ATCCAGCAAA CCCATATACA AGATTGATGA TGATATTTTG 2940 

ATCGACAACC GCTGTGACAC CTAACGCCCA CTTATACGTA AATAAAATCA CGAATAACGC 3000

AATAACAAGT TGCAAGATAA TAAACCTTCG CATTTGAAAA TTATTTGTCG TTAAATCAAT 3060 

TTTATGCATT ACCAACCCTC CCGATCATGA CATTCTTATT CTTCTTTAAA TATAGTATAC 3120 

AATGTCACAT TTAATTTAAA AAGTTCATAT CAAGAAAGTA AATTGGCTGT AATAAAATTT 3180 

TAATATACGA CTTCTTTCTT CACTTATTAA GGCGAAATTT TATCtCAAAT CATGTGCGCT 3240 

ATTTCAAATT GAATAATGCC ACTGTCTCAA CATGTGTTGT TTGTGGAAAC ATATCTACCG 3300

GTGTTACCTC TTCAAGTTGA TATTTTTCAG CTAATAATAA TGCATCACGT TGCTGTGTTG 3360 

CGGGATTACA TGAAATATAG ACAATACGCT TAGGTTCTAA TGTAAGCAAA GTCTGAATAA 3420 

ACGTTTCGTC ACAGCCCTTT CTTGGCGGAT CAACCATTAC AACATCTGGT TTAATCCCTT 3480

GTGCTTTCCA TTGTAAAATA ACTTCTTCAG CTTTCCCACA GACAAAAGTT GTATTATTGC 3540 

ATTGGTTTAT AGTCGCATTT TGTTGTGCGT CTTCAATTGC AGAAGGTACT ACTTCAACAC 3600 

CGTATACATG TTTTGCAAGT GGTGCCATAT ATAGCCCTAT TGTTCCAATA CCACAATAGG 3660 

TATCTAATAC AACTTCATTA CCTGTCAATT GCGCATACTC AATTGCTTTA TTATATAATT 3720 

TCTCTGTTTG TTCAGAATTA ATTTGGTAGA ATGACTGATC ACTTATTTTA AATGTACTAT 3780 

CTGTTAATTG ATCAATAATT GTATCTTTAC CATATAGCGT TATAGATTGA CGTCCCATAA 3840 

TAACATTAGA GTGGCTATCA TTAATGTTTT GTTTAATGCT TGTCACATTA GGAAATGCAT 3900 

CTAATATCTT CTCAACAACA GCATTTTTTT GTGGCCACTT TTTACCATTA GTTACAAAAA 3960 

TAATCATCAT TTCGTCTGTA TGATATCCTG TTCTTACAAC CAAATGTCTC ATTAAACCTT 4020 

TTTTCAATTG TTCTTGATAA ATACTTACAT TTAAATCTTT TAAAATAGAT TTAACTTCAT 4080 

TCATCACTTC TTGATGTTGT GAATCTTGTA TTAAACAACT TTCCATGTCA ATAATGTCAT 4140

GGCTTCTTTG ACGATAAAAG CCCATAATAA CTTCATTCTG TTCATTCTTA CCAACTGGAA 4200 

TCTGGGACTT GTTTCGATAT CTCCAAGGAT CTGTCATGCC AACTGTATCG TTAATCTTAG 4260 

AATTATCAAA ATGCGCTTTT CGCTGAAACA AATTAATCAC TTGTTCCTTT TTCATTTCAA 4320

GTTGTGCTTC GTATGATAAG TGTTGAAGTT GGCACCCACC ACAACGTTCA TAATATATAC 4380 
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